Preface


A one-day meeting in March 1999 at Nottingham was convened to explore techniques
for modeling human performance in synthetic environments. A list of participants is
available as Appendix A. The presentations served as preliminary versions of some chapters
of this book. The chapters were expanded based on the day’s discussions, extended
reflection, and further informal discussion.

Unlike a very similar, earlier review (Elkind, Card, Hochberg, & Huey, 1990) that noted
the  need  to  develop  theory  before  applying  such  models,  we  are  able  to  conclude  that  the 
models  presented  here  are  available  and  useful.  The  question  remaining  is  how  to  improve 
them.  We  found  that  the  resulting  report  was  usable  as  a  general  update  to  Pew  and  Mavor’s 
(1998)  book,  as  it  reviewed  work  that  was  done  after  their  book.  In  particular,  we  were  able 
to  examine  a  wider  variety  of  cognitive  architectures  developed  outside  the  United  States. 
This  report  also  provides  a  detailed  source  of  further  ideas  and  suggestions  for  projects.  We 
particularly  draw  the  reader’s  attention  to  the  importance  of  the  integration  and  usability  of 
models.  Some  implications  apply  more  to  the  United  Kingdom  and  Australia,  but  nearly  all 
are  general. 

The  report  proved  popular,  so  we  updated  it  and  looked  for  a  publisher  to  help 
disseminate it more widely. Mike McNeese was instrumental in putting us in touch with the
Human  Systems  Information  Analysis  Center  (HSIAC).  We  are  grateful  to  HSIAC  for 
agreeing  to  publish  this  book  and  preparing  it  for  publication.  Comments  from  Jeffrey  A. 
Landis,  HSIAC  Publications  Manager  and  Editor,  and  Dr.  Michael  Fineberg,  HSIAC  Chief 
Scientist,  have  significantly  improved  this  work.  We  appreciate  their  support. 

Stephen  Croker  and  Peter  Lonsdale  provided  useful  comments  and  helped  assemble 
these  materials.  In  addition  to  the  workshop  participants  listed,  we  thank  Angie  Barnhill, 
Jim Barnhill, Christina Bartl, Kevin Gluck, Simon Goss, Ian Greig, Robin Hollands,
Nicholas Howden, Jim Jansen, Andrew Lucas, Mike McNeese, Emma Norling, Ralph
Ronnquist,  and  Colin  Sheppard  for  their  help  or  comments.  Brian  Logan  and  Aaron  Sloman, 
while  not  listed  as  authors,  did  provide  material  that  substantially  helped  in  the  preparation 
of  the  book.  This  project  was  primarily  supported  by  DERA  (Bedford,  UK)  under  contract 
LSA/E20307,  and  also  by  DSTO  (Australia)  and  later  by  the  (US)  Office  of  Naval  Research 
(contracts N000140110243, N000140110547, and N000140210021). The conclusions
reported  here,  however,  are  solely  the  responsibility  of  the  authors. 

Frank  E.  Ritter 

University  Park,  Pennsylvania 
January  2003 
CHAPTER 1


Tasks  and  Objectives  for  Modeling  Behavior  in 
Synthetic  Environments 


There are now numerous models of human behavior in Synthetic Environments (SEs),
and they serve a multitude of uses. It is worthwhile considering where and how to improve
these models to provide more realistic human behavior. This report provides a more recent
review of work following Pew and Mavor (1998), and provides a detailed source of further
ideas  and  suggestions.  In  addition  to  noting  areas  where  models  could  be  expanded  to 
include  more  complete  performance,  we  particularly  draw  the  reader’s  attention  both  to  the 
importance  of  the  integration  of  models  (and  thus  their  reuse)  and  to  the  usability  of  models. 
We  will  argue  that  improved  usability  (and  reusability)  is  necessary  for  these  models  to 
achieve  their  potential.  We  extend  Pew  and  Mavor’s  results  by  examining  architectures 
(e.g.,  COGENT,  JACK,  hybrid  architectures)  that  were  not  included  or  available  when  Pew 
and  Mavor  compiled  their  report,  and  by  summarizing  several  promising  areas  for  further 
work  that  have  arisen  recently. 

This  report  reflects  the  biases  and  specific  expertise  of  the  authors  as  they  attempt  to 
identify  a  wide  range  of  potential  problems  and  provide  possible  solutions.  Some  of  the 
proposed  projects  are  high  risk  and  not  all  of  the  authors  agree  that  these  projects  can  be 
accomplished.  All  agree,  however,  that  if  possible,  they  would  be  rewarding.  Given  the 
diversity  of  human  behavior,  there  remain  many  issues  not  covered  here.  For  example,  many 
aspects  of  teamwork  are  important  but  not  examined  here.  Most  of  the  systems  and 
architectures  reported  here  are  continually  evolving.  Because  of  the  rapid  pace  of 
development  in  this  area,  our  review  may  underestimate  the  capabilities  of  these  systems 
and  several  of  our  suggestions  may  already  be  incorporated  in  them. 

1.1  The  Role  of  Synthetic  Forces 

There  are  several  commonly  acknowledged  uses  of  cognitive  models  in  synthetic 
environments. These uses have included at least the range shown in Table 1.1. This is a
wide  set.  Pew  and  Mavor  (1998)  focused  on  the  application  of  synthetic  forces  to  training 
partly  because  the  major  applications  and  successes  of  synthetic  forces  have  been  in  this 
domain.  Further  uses  of  synthetic  forces  have  been  outlined  in  other  reviews  (Computer 
Science  and  Telecommunications  Board,  1997;  Lucas  &  Goss,  1999;  Synthetic 
Environments  Management  Board,  1998). 
Modeling Human Performance


Table  1.1:  Potential  Uses  of  Models  in  Synthetic  Force  Environments 

•  Training  leaders 

•  Joint  and  combined  training 

•  Training  other  personnel  (e.g.,  support  and  logistics) 

•  Testing  existing  doctrine 

•  Testing  possible  future  procurements 

•  Testing  new  doctrine 

•  Serving  as  a  formal,  runnable  description  of  doctrine 


The  user  community  for  synthetic  forces  would  be  better  served  if  all  these  uses  were 
supported  by  a  single  system  or  approach.  Currently,  the  models  of  behavior  in  these 
systems have often been developed without a long-term plan, and are only usable within the
simulation  for  which  they  were  developed.  Historically,  few  single  systems  have  supported 
more  than  one  or  two  of  the  uses  noted  in  Table  1.1.  This  is  wasteful  and  can  lead  to 
different  behaviors  being  taught  or  used  in  different  simulations  when  they  should  be 
exactly  the  same  behavior.  The  use  of  the  Distributed  Interactive  Simulation  (DIS)  protocol 
for  distributed  simulation  is  a  step  toward  integration,  but  it  does  not  apply  to 
behavior  itself. 

While  having  a  single  system  or  approach  is  highly  desirable,  there  are  good  reasons 
why  multiple  systems  are  currently  used  (in  addition  to  a  multitude  of  bad  reasons  as  well). 
Perhaps the most important reason why there are multiple models of behavior is that
existing  approaches  to  modeling  cannot  support  all  of  the  uses  in  Table  1.1  equally  well. 
Models  that  focus  on  aggregate,  or  large  unit  behavior,  do  not  support  low-level  simulations 
very  well.  Models  that  predict  average  behavior  are  much  less  useful  for  practicing  tactics 
and  procedures.  Models  that  are  good  for  training  provide  detailed  data  that  have  to  be 
extensively  summarized  and  aggregated  to  be  of  use  to  planners.  Planners  and  evaluators, 
for  example,  may  find  useful  data  in  large  simulations  such  as  the  Purple  Link  exercise,  part 
of STOW97 (further information is available from Ceranowicz, 1998, as well as from
www.stricom.army.mil/STRICOM/DRSTRICOM/), although such simulations cannot yet
be convened within an afternoon or even a week to examine how a new platform performs.
This report makes suggestions on all of these levels, but it does not intend to be
comprehensive. 

1.2  Definition  of  Terms 

There  are  several  terms  used  in  this  report  that  have  meanings  specific  to  the  domain  of 
behavioral modeling. The term “model,” for example, will refer exclusively to cognitive
models,  and  the  term  “simulation”  will  refer  exclusively  to  task  simulations.  We  review 
these  terms  here,  starting  by  introducing  synthetic  forces.  Modular  Semi-Automated  Forces 
(ModSAF)  is  briefly  explained  to  provide  a  common  system  as  a  point  of  reference.  We 
then  define  the  terms  we  will  use  with  respect  to  models  of  behavior. 
1.2.1 Synthetic Forces

Synthetic  forces  exist  in  military  simulations,  sometimes  alongside  real  forces  that  have 
been instrumented and linked to the simulation. There are now synthetic force simulations
covering all of the armed services. Synthetic forces can be separated into two components,
physical and behavioral. The physical aspects represent the movement and state of platforms
(objects) in the simulation, including such aspects as maximum speed and the set of actions
that can be performed in the world. The physical aspects provide constraints on behavior.
Simulations of the physical aspects are fairly complete now for most purposes, although
they remain important in their own right (Computer Science and Telecommunications
Board, 1997; Synthetic Environments Management Board, 1998).

The behavioral aspects of a synthetic force platform determine where, when, and how it
performs the physical actions, that is, its behavior. Many human and entity behaviors can be
simulated, such as movement and attack, but behavior has been less veridically modeled
than physical performance. The next step to increase realism is not only to include further
intelligent  behavior  but  also  to  match  more  closely  the  timing  and  sequence  of  human 
behavior  when  performing  the  same  tasks. 

1.2.2  Modular  Semi-Automated  Forces 

Modular Semi-Automated Forces (ModSAF) is a system for simulating entities
(platforms) on a simulated battlefield (Loral, 1995). It is perhaps the most widely used
behavioral simulator in military synthetic environments. The goal of ModSAF is to replicate
the  behavior  of  simulated  platforms  in  sufficient  detail  to  provide  useful  training  and 
simulation  of  tactics. 

ModSAF includes the ability to simulate the most common types of physical platforms,
such as a tank, and external effects on those platforms, like weather and smoke. The terrain
is  defined  in  a  separate  database,  which  is  shared  by  other  simulators  in  the  same  exercise 
using  the  DIS  simulation  protocol.  Multiple  platforms  can  be  simulated  by  a  single 
ModSAF  system. 

The local platforms interact with remote platforms by exchanging approximately 20
different  types  of  information  packets.  Examples  of  packet  types  include  announcing  where 
the  platform  is  located  (the  other  platforms  compute  whether  the  originator  can  be  seen), 
where  radar  is  being  emitted,  and  where  shots  are  being  fired.  Thus,  the  features  of  the 
packets  vary.  Each  simulation  is  responsible  for  updating  its  own  position  and  computing 
what  to  do  with  the  information  in  each  packet,  so  that  a  tank  does  not  directly  shoot  another 
tank,  for  example.  Attackers  send  out  projectile  packets  and  the  target  tank  computes  that  it 
would  be  damaged  by  their  projectiles. 
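
The exchange described above can be sketched in a few lines. This is only an illustration of the pattern (broadcast packets, with each receiver deciding locally what they mean for it); it is not ModSAF code and does not reproduce the actual DIS packet formats. The class names, fields, the 1,000-unit sight range, and the circular blast radius are all hypothetical simplifications.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityStatePacket:
    """Hypothetical packet announcing where the sending platform is."""
    sender: str
    x: float
    y: float


@dataclass(frozen=True)
class FirePacket:
    """Hypothetical packet announcing where a projectile lands."""
    sender: str
    impact_x: float
    impact_y: float
    blast_radius: float


class Platform:
    """A simulated platform that interprets incoming packets on its own."""

    def __init__(self, name, x, y, sight_range=1000.0):
        self.name = name
        self.x = x
        self.y = y
        self.sight_range = sight_range  # assumed, simplistic visibility rule
        self.visible = set()            # names of platforms this one can see
        self.damaged = False

    def announce(self):
        return EntityStatePacket(self.name, self.x, self.y)

    def fire_at(self, x, y, blast_radius=10.0):
        return FirePacket(self.name, x, y, blast_radius)

    def receive(self, packet):
        if packet.sender == self.name:
            return  # a platform ignores its own packets
        if isinstance(packet, EntityStatePacket):
            # The receiver, not the originator, computes visibility.
            distance = math.hypot(packet.x - self.x, packet.y - self.y)
            if distance <= self.sight_range:
                self.visible.add(packet.sender)
            else:
                self.visible.discard(packet.sender)
        elif isinstance(packet, FirePacket):
            # The target, not the attacker, computes its own damage.
            distance = math.hypot(packet.impact_x - self.x,
                                  packet.impact_y - self.y)
            if distance <= packet.blast_radius:
                self.damaged = True


def broadcast(packet, platforms):
    """Deliver one packet to every platform in the exercise."""
    for platform in platforms:
        platform.receive(packet)
```

In use, an attacker would call `broadcast(attacker.fire_at(x, y), platforms)` and never touch the target's state directly; whether the target is damaged is decided inside the target's own `receive` method, which mirrors the division of responsibility described in the text.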

Some semi-intelligent behaviors are included in ModSAF through a set of about 20
different  simple  scripts.  These  scripts  support  such  activities  as  moving  between  two  points, 
hiding,  and  patrolling. 

ModSAF  is  a  large  system.  It  can  be  compiled  into  several  major  versions,  including 
versions  to  test  networks  and  specific  versions  for  each  service.  The  terrain  databases  each 
include  up  to  1  gigabyte  of  data.  In  1999,  simulating  multiple  entities  required  a  relatively 
fast workstation (100 MHz+) with a reasonable amount of memory (32 MB+).
A major problem is usability, as ModSAF is large and has a complicated syntax. Users
report  problems  learning  and  using  it.  A  better  way  to  provide  its  functionality  needs  to  be 
found  or  its  usability  needs  to  be  improved  directly. 

1.2.3  Frameworks,  Theories,  Models,  and  Cognitive  Architectures 

It  is  common  in  cognitive  science  to  differentiate  between  several  levels  of  theorizing 
(e.g., Anderson, 1983; 1993, chap. 1) and defining these levels now will help us in the
remainder  of  this  report.  Framework  refers  to  the  specification  of  a  few  broad  principles, 
with  too  many  details  left  unspecified  to  be  able  to  make  empirical  predictions.  For 
example,  the  idea  that  human  cognition  acts  as  a  production  system  offers  a  framework  for 
studying  the  human  mind. 

Theory  adds  more  precision  to  frameworks,  and  describes  data  structures  and 
mechanisms  that  at  least  allow  qualitative  predictions  to  be  made.  For  example,  the 
production  system  principles  presented  in  Newell  and  Simon  (1972)  form  a  theory  of  human 
cognition. 

Models  are  theories  implemented  as  computer  programs  or  represented  mathematically 
to  apply  to  specific  situations  or  types  of  situations.  While  generally  more  limited  in  their 
domain  of  application  than  theories,  models  typically  provide  more  accurate,  quantitative 
predictions. 

Cognitive  architecture  has  two  meanings:  (1)  specifications  of  the  main  modules  and 
mechanisms  underlying  human  cognition,  and  (2)  the  computer  program  implementing 
these  specifications.  These  meanings  are  separate  and  distinct  but  usually  are  used  as 
equivalent.  Cognitive  architectures,  as  proposed  by  Newell  (1990),  offer  a  platform  for 
developing  cognitive  models  rapidly  while  keeping  the  theoretical  coherence  between  these 
models intact. These cognitive architectures are often seen as equivalent to Unified Theories
of  Cognition  (UTC),  a  way  to  pull  all  that  is  known  about  cognition  into  a  single  theory.  In 
Appendix  B  we  include  brief  descriptions  of  two  commonly  used  cognitive  architectures, 
ACT-R  and  Soar. 

There  exists  no  generally  agreed  definition  of  hybrid  architectures.  Some  use  the  term 
when  a  cognitive  architecture  includes  symbolic  features  (e.g.,  a  production  system)  as  well 
as  non-symbolic  features  (e.g.,  neural  net  spreading  of  activation  among  memory  elements); 
others,  such  as  Pew  and  Mavor  (1998),  use  the  term  when  two  or  more  architectures  of  any 
kind are combined (e.g., Soar and EPIC). We use the latter definition here because this type
of  hybrid  architecture  has  become  more  important  and  more  frequently  used. 

When  comparing  theoretical  proposals,  it  is  essential  to  keep  in  mind  the  level  at  which 
the  proposals  were  formulated.  Typically,  a  framework  will  cover  a  large  amount  of 
empirical  regularities  without  specifying  many  details,  while  a  model  will  cover  a  small 
amount of data with great precision. It is generally agreed that models are more useful
scientifically than theories or frameworks because they make clear-cut predictions that can
be tested with empirical data, and hence, are less amenable to ad hoc explanations (Popper,
1959).  Models  are,  however,  harder  to  create  and  use. 
1.3 Summary of Modeling Human and Organizational Behavior

While the reader is likely to have seen Pew and Mavor’s (1998) Modeling Human and
Organizational Behavior, we briefly review it here to provide background for readers not
familiar with it and to provide some useful context. In their book, Pew and Mavor review
the state of the art in human-behavior representation as applied to military simulations, with
an emphasis on cognitive, team, and organizational behavior. Their book is based on a panel
that met for 18 months and drew extensively on a wide range of researchers. It is available
as a hardcopy book, as well as online (books.nap.edu/catalog/6173.html).

Pew  and  Mavor  look  not  just  at  representing  behavior,  but  also  at  methods  for 
generating behavior. They provide a review of the uses of models of behavior in synthetic
environments. They include a review of the major synthetic environments in use by the U.S.
military. These environments are examples of the range of current and potential uses and
levels  of  simulation. 

Their book provides a useful summary of integrated (cognitive) architectures. It is
comprehensive and clear enough that we have used it to teach undergraduate students. Their
summary includes a table comparing the architectures. We will apply the same table to
review several additional architectures.

Their book also reviews the areas important to modeling human behavior in synthetic
environments. This is a very wide range, encompassing nearly all of human behavior. Their
book reviews attention and multi-tasking, memory and learning, human decision making,
situation awareness, planning, behavior moderators (such as fatigue and emotions),
organizational (small group) behavior, and information warfare (e.g., how the order of
information presentation influences decision making). Their book concludes with a
framework for developing models of human behavior followed by conclusions and
recommendations. Each of these reviews is clearly written and limited only by the space it is
allowed. The reviews are quite positive, suggesting that major aspects of behavior are either
already being modeled, or can and will be modeled within a few years. This positive tone is
in stark contrast to a similar review a decade earlier, which could only note open questions
(Elkind, Card, Hochberg, & Huey, 1990).

1.4  What  Modeling  Human  and  Organizational  Behavior  Does  Well 

Pew and Mavor’s book is a useful and seminal book for psychology and modeling.
Their book is useful because the reviews it provides, while they could be extended, are
unusually clear and comprehensive, covering the full range of relevant behavior. It could
serve as a useful textbook for professionals in other areas to teach them current results and
problems in the areas of psychology and modeling.

Their book is seminal because the authors lay out a complete review of cognition that is
widely usable. While their review is similar to Newell’s (1990) and Anderson and Lebiere’s
(1998) reviews, Pew and Mavor’s review is not situated within a single architecture; the
result is a more global and only slightly less-directed view.
The reviews of the models and data to be modeled together, because of their scope and
potential impact, constitute a call to arms for modelers of synthetic forces. The juxtaposition
of the data and ways to model them is enticing and exciting. This approach of modeling
behavior  will  significantly  influence  psychology  in  general  if  the  modeling  work  continues 
to  be  successful.  Models  of  synthetic  forces  in  the  near  future  will  subsume  enough  general 
psychology  data  that  they  will  simply  represent  the  best  models  in  psychology. 

1.5  Where  Modeling  Human  and  Organizational  Behavior  Can  Be  Improved 

There  are  surprisingly  few  problems  with  Pew  and  Mavor’s  review.  However,  they  do 
not  review  all  of  the  possible  regularities  of  human  behavior.  We  will  add  a  few  additional 
important  regularities  and  provide  further  arguments  to  support  many  of  their  main 
conclusions.  They  could  have  referenced,  for  example,  the  Handbook  of  Perception  and 
Human  Performance  (Boff,  Kaufman,  &  Thomas,  1986)  and  the  Engineering  Data 
Compendium  (Boff  &  Lincoln,  1988)  for  a  wide-ranging  list  of  existing  general  regularities 
in perception and performance (the latter reference has also been put into a CD-ROM version;
see iac.dtic.mil/hsiac/products/cashe/cashe.html). In the area of human decision
making, Dawes’ (1994) review is also valuable. Pew and Mavor do not cite a quite relevant
report on how this type of modeling is also being developed as entertainment (Computer
Science and Telecommunications Board, 1997), and, not surprisingly, they could not report
a concurrent similar United Kingdom review (Synthetic Environments Management Board,
1998). 

On  a  high  level  and  early  on,  they  explicitly  note  that  they  will  not  review  the  usability 
of  behavioral  models.  We  will  argue  that  improved  usability  is  necessary  for  these  models  to 
achieve their potential.

They  do  not  have  the  space  to  review  all  the  integrative  (cognitive)  architectures.  While 
it  would  be  unfair  to  call  this  book  dated  at  this  point  in  time,  there  are  already  a  few 
architectures  worth  considering  that  were  not  available  to  them. 

They do not dwell on the ability to describe human behavior; instead, they focus on how
to generate it. There remains some need to be able to describe the behavior before
generating  it,  which  we  will  take  up  below. 

Finally,  they  did  not  have  the  space  to  lay  out  very  detailed  projects  to  fulfill  their 
short-, medium-, and long-term goals. We provide a more detailed, but still incomplete, set.

1.6  Structure  of  This  Report 

Chapter 2 provides amplifications, updates, and additions to Pew and Mavor’s list of
psychological regularities that should be included in models of human behavior. Chapter 3
notes  problems  integrating  models  with  simulations  as  well  as  problems  integrating  them 
with  each  other  to  make  larger,  more  complete  models.  Chapter  4  takes  up  the  issues 
surrounding  usability  of  behavioral  models.  Usability  of  the  models  themselves  was 
considered to be outside the scope of Pew and Mavor’s report (1998, p. 10). We will argue
that  improving  the  usability  of  these  models  by  their  creators  and  other  analysts  is  not  only 
desirable,  but  necessary  for  the  success  of  modeling  itself.  Chapter  5  considers  new 
techniques  and  cognitive  architectures  for  modeling  human  behavior  in  synthetic 
environments with respect to the aims of the previous two chapters. Chapter 6 concludes
with  a  list  of  projects  to  address  problems  identified  in  Chapters  2,  3,  and  4  based  on  the 
techniques  and  architectures  in  Chapter  5. 
Current Objective: More Complete Performance


There is a wide range of behaviors that have yet to be incorporated into existing
models.  Included  in  this  list  are  numerous  additional  relevant  regularities  about  human 
behavior  (see  Boff  &  Lincoln,  1988,  for  a  subset).  The  question  that  must  be  addressed  is: 
which  behaviors  are  the  most  important  and  most  accessible  to  incorporate?  We  note  here 
several  of  the  most  promising  or  necessary  behaviors  to  be  included  next  in  models  of 
human  performance,  based  on  our  experiences  and  previous  work. 

The  suggestions  we  make  later  tend  to  be  based  on  modeling  the  individual.  Much  of 
the  behavior  being  modeled  currently  in  synthetic  environments  is  different  because  it  needs 
to  include  small  and  large  groups  and  is  aggregated  across  time  or  situations.  As  smaller 
time  scales  and  more  intricate  and  fine-grained  simulations  are  developed  and  used,  such  as 
for  modeling  urban  terrorism,  the  behavioral  issues  noted  here  will  become  more  important. 

We  start  with  learning.  While  Pew  and  Mavor  include  learning  as  a  useful  aspect  of 
performance, we believe learning to be essential. We also expand the case for including
models  of  working  memory,  perception,  emotions  and  behavioral  moderators,  and  erroneous 
behavior.  We  then  can  examine  higher-level  aspects  of  behavior  to  be  considered,  starting 
with  integration  of  models  and  ending  with  information  overload. 

2.1  Learning 

Learning is mentioned as important in several ways by Pew and Mavor (1998).
Learning (i.e., training) is the largest role of the military in peace time (i.e., rehearsal, p. 30),
essential for multi-tasking behavior (pp. 114-115), an important aspect of human behavior
(chap. 5), and important within groups (chap. 10). We cover learning again here.

Pew and Mavor mention several of the advantages of learning. There are several
additional  advantages  that  we  can  emphasize.  Tactics  are  influenced  by  learning.  In 
addition,  there  is  a  home-field  advantage:  working  within  your  own  territory,  because  you 
know  it,  makes  additional  tactics  feasible  and  provides  generally  improved  performance. 
(Working  within  your  own  territory  would  also  provide  some  additional  motivation.) 

Including learning in models would provide a mechanism for producing different levels
of behavior. Experienced troops, for example, would be different not in some numeric way
in that they react faster (although this is probably true), but in a more qualitative way in that
they know more and use different strategies. Learning modifies, constrains, and supports the
use  of  computer  interfaces  (Rieman,  Young,  &  Howes,  1996);  similar  effects  may  be  found 
in  exploring  terrain  and  implementing  tactics  in  new  geographic  spaces. 

Programming, that is, creating the model directly, may be too difficult. It may be
easier for models to learn behaviors than for these behaviors to be programmed directly.
This argument has been put forward by connectionist researchers for some time.


Human Systems IAC SOAR, 2003




Theoretical  work  in  this  area  of  learning  has  direct  implications  for  training  within  the 
military  and  within  schools.  Models  that  learn  can  be  used  to  understand  and  optimize 
learning  (Ohlsson,  1992).  If  we  can  program  models  to  learn,  the  behavior  and  knowledge 
that  result  may  be  different  from  the  initial  knowledge  that  the  system  started  with  or  from 
the  expert  performance  that  we  currently  teach.  This  final  knowledge  may  be  useful  for 
teaching.  In  the  case  of  photocopying  (Agre  &  Shrager,  1990),  for  example,  better  strategies 
arise  through  practice  but  are  not  valuable  enough  to  teach.  In  military  domains,  it  may  be 
useful  to  find  and  then  to  teach  the  improved  strategies  that  may  arise  from  grossly  extended 
practice,  that  is,  tactics  that  are  better  but  that  no  person  has  had  enough  practice  to  learn 
before.  At  that  point,  explanation  of  behavior  will  also  become  important  to  understand  why 
the  new  behavior  is  useful  so  that  it  is  trusted. 

2.2  Expertise 

Expert  behavior  has  an  important  role  to  play  in  models  of  human  performance 
(Shadbolt  &  O’Hara,  1997).  One  of  the  Western  powers’  greatest  strengths  is  training  in 
depth  and  breadth.  Practice  influences  speed  of  processing  and  error  rates,  particularly  under 
stress. If synthetic forces are to be used to test doctrine, the effect of training on expertise
must  be  included. 

Expert  behavior  has  been  studied  extensively  in  recent  years  and  a  great  deal  is  known 
about  it  (Chipman  &  Meyrowitz,  1993;  Ericsson  &  Kintsch,  1995;  Gobet,  1998;  Gobet  & 
Simon,  2000;  Hoffman,  Crandall,  &  Shadbolt,  1998).  Some  essential  characteristics  of 
expertise  are  highly  developed  perception  for  the  domain  material,  selective  search  for 
solutions  in  that  domain,  and  a  good  memory  for  domain-related  material.  In  most  domains, 
problem-solving behavior (search) differs as well: novices tend to search backward from the
goal to find solutions and experts tend to search forward from the situation to find
solutions  (Larkin,  McDermott,  Simon,  &  Simon,  1980).  Finally,  transfer  of  expertise  to 
other  domains  is  limited. 

Klein  and  his  colleagues  (e.g.,  Klein,  1997)  have  studied  real-time  performance  in  real 
settings  (as  opposed  to  laboratory  settings)  in  detail,  and  have  essentially  found  that  the 
characteristics  mentioned  above  are  also  critical  in  these  situations.  A  number  of  rather 
extensive  reviews  have  been  undertaken  of  Klein’s  approach,  which  is  often  referred  to  as 
Naturalistic  Decision  Making  (NDM)  (e.g.,  Hoffman  &  Shadbolt,  1995).  A  method  to  elicit 
this  type  of  knowledge  has  been  developed  by  Klein  and  his  associates.  It  is  known  as  the 
Critical Decision Method and is described in Hoffman et al. (1998). The specifically
real-time challenges of acquiring knowledge relating to perceptually cue-rich decision making
are discussed in a second Defence Evaluation and Research Agency (DERA), United
Kingdom,  report  by  Hoffman  and  Shadbolt  (1996). 

Given that it takes a long time to become an expert (the rule of 10 years or
10,000 hours of practice and study is often mentioned; e.g., Simon & Chase, 1973), the
size of the dataset has made it difficult indeed to study real-time learning on the road to
expertise.  However,  real-time  learning  in  simpler  problem-solving  tasks  has  been  studied 
and  modeling  accounts  have  been  provided  (Anzai  &  Simon,  1979;  John  &  Kieras,  1996; 
Nielsen  &  Kirsner,  1994;  Ritter  &  Bibby,  2001).  Some  of  these  results  may  apply  to  expert 
learning in more complex tasks as well.

While experts vastly outperform non-experts in most domains, exceptions to this rule
have been found in domains such as clinical diagnosis, clinical prediction, personnel
selection, and actuarial predictions (Dawes, 1988). In these domains, experts perform only
slightly  better  than  non-experts,  and  typically  perform  worse  than  simple  statistical  methods, 
such  as  regression  analysis.  One  other  aspect  of  behavior  that  distinguishes  experts  from 
novices is the ability to recover from errors. An important question is to which category
military diagnosis and prediction belong, given the uncertainties involved. And, based
on the answer, what can be done (either by providing formal tools or by improving training)
to remedy this situation and assist error recovery?

The  effect  of  learning  local  environments  and  strategies  (own  and  opponent’s)  must  also 
be  included.  Having  learned  the  local  terrain  probably  explains  much  of  the  home-field 
advantage. How does this learning occur?

Within  the  sub-field  of  knowledge-engineering  there  have  been  considerable  efforts  to 
produce methodologies for the acquisition, modeling and implementation of knowledge-intensive
tasks. It is a moot point whether the resulting decision-support systems are
cognitively plausible. Nevertheless, these methodologies now provide powerful ways of
constructing complex systems that exhibit task-oriented behavior. To this end, anyone
engaged in engineering large-scale synthetic environments should look at the principles laid
down in the most recent of this work. The most accessible source is probably Schreiber et al.
(2000).

2.3  Working  Memory 

Central  to  all  questions  about  human  cognition  and  performance  is  the  role  of  working 
memory.  Working  memory  is  implicated  in  almost  all  aspects  of  cognitive  performance 
(Boff & Lincoln, 1986, Sec. 7; Just & Carpenter, 1992; Newell & Simon, 1972; Wickens,
1992). It is widely agreed that limitations of working memory are a major determinant of
limitations of cognitive performance. Definitions of working memory are varied but for
present purposes we can take it to refer to the mechanisms that maintain and provide access
to information created or retrieved during the performance of a task.

Modern approaches to the psychological study of human working memory often take as
their  starting  point  the  famous  paper  by  Miller  (1956)  and  argue  that  people  can  retain  only 
around  “7  +/-  2”  items  in  short-term  memory.  Later  work  has  tended  to  revise  that  estimate 
downwards,  towards  three  to  four  items  of  unrelated  information  (Crowder,  1976;  Simon, 
1974). 

A  more  recent  and  influential  line  of  work  by  Baddeley  (1986,  1997)  presents  working 
memory as a dual system for the rehearsal of information, consisting of (1) a phonological
loop, that contains approximately 2 seconds of verbalizations, for the rehearsal of
phonological, acoustic, or articulatory information (e.g., useful for repeating a phone
number until you dial it); and (2) a visual-spatial scratchpad, with a smaller and less-determined
capacity (e.g., useful when searching for an object that you have just seen), to
play an analogous role for the maintenance of pictorial and spatial information.

Other approaches within experimental psychology place more emphasis on the role of
working  memory  in  both  storing  and  manipulating  temporary  information  (Daneman  & 
Carpenter,  1980;  Just  &  Carpenter,  1992).  An  important  recent  extension  to  the  notion  of 
working  memory  comes  from  the  study  of  expertise,  where  Ericsson  and  Kintsch  (1995) 
argue  that  after  extensive  practice  in  a  particular  domain  people  can,  through  specialized 
retrieval  structures,  use  long-term  memory  for  the  rapid  storage  of  temporary  information 
(i.e., long-term working memory).

A  recent  book  (Miyake  &  Shah,  1999)  reviews  a  range  of  current  approaches  to  the 
modeling  of  working  memory,  although  many  of  the  models  do  not  have  the  explicitness 
and  generality  needed  to  support  the  simulation  of  human  performance  in  complex  tasks.  Of 
those  that  do,  their  view  of  working  memory  varies  widely.  Some,  such  as  ACT-R 
(Anderson  &  Lebiere,  1998)  and  CAPS  (Just  &  Carpenter,  1992),  consider  working  memory 
not  as  a  separate  structural  entity  but  rather  as  an  activated  region  of  a  larger,  more  general 
memory  system,  in  which  the  limitations  of  working  memory  derive  from  a  limited  total 
quantity  of  activation.  Just  and  Carpenter  (1992),  and  more  recently  ACT-R  models,  have 
extended  that  view  to  the  modeling  of  individual  differences  in  working  memory  where 
different  people  are  assumed  to  have  different  maximum  quantities  of  available  activation 
(Daily,  Lovett,  &  Reder,  2001;  Lovett,  Daily,  &  Reder,  2000).  A  number  of  these  ideas  are 
put  together  by  Byrne  and  Bovair  (1997)  who  modeled  (in  CAPS)  the  way  that  a  class  of 
performance  errors,  in  which  people  forget  to  complete  subsidiary  aspects  of  a  task  (such  as 
removing  the  original  from  a  photocopier),  is  affected  by  working  memory  load. 
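
The idea that working-memory limits derive from a limited total quantity of activation can be illustrated with a minimal Python sketch. It is not the ACT-R or CAPS implementation. The function names, the proportional scaling rule, the retention threshold, and all numeric values are assumptions chosen only for illustration. The sketch shows how a subsidiary task element (such as remembering to remove the original from a photocopier) could drop out first when an individual with less available activation is under load.

```python
def allocate_activation(demands, capacity):
    """Scale requested activations so their total never exceeds capacity.

    demands  -- dict mapping item name to requested activation (>= 0)
    capacity -- total activation available to this (hypothetical) individual
    """
    total = sum(demands.values())
    if total <= capacity:
        # Enough activation for every element: all demands are met in full.
        return dict(demands)
    # Over capacity: every element is scaled back by the same proportion.
    scale = capacity / total
    return {item: amount * scale for item, amount in demands.items()}


def retained(demands, capacity, threshold):
    """Return the items whose allocated activation stays at or above threshold."""
    allocated = allocate_activation(demands, capacity)
    return sorted(item for item, a in allocated.items() if a >= threshold)


# Illustrative task: copying a page while holding a subsidiary goal in mind.
task = {"main-goal": 4.0, "current-step": 3.0, "remove-original": 1.5}

high_capacity = retained(task, capacity=10.0, threshold=1.0)
low_capacity = retained(task, capacity=5.0, threshold=1.0)
```

With these made-up numbers, the high-capacity setting retains all three elements, while the low-capacity setting loses only the weakest one, the subsidiary goal. Individual differences are represented here by nothing more than a different `capacity` value.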

In contrast to these resource-limited models, Soar (Laird, Newell, & Rosenbloom,
1987; Newell, 1990) imposes no structural limitation on working memory. Using Soar,
Young and Lewis (1999) explore the possibilities of working memory being constrained
not by physical resources but by functional limitations and by specific kinds of
similarity-based interference.

In summary, the current position is that human performance is known to be highly
dependent on working memory and working memory load, and to be susceptible to factors
such as individual differences (Just & Carpenter, 1992), distractions (Byrne & Bovair,
1997), emotion and stress (Boff & Lincoln, 1988), and expertise (Ericsson & Kintsch,
1995). Many existing models of human performance (e.g., as reviewed in Pew & Mavor,
1998) do not directly model the role of working memory. Models exist (Miyake & Shah,
1999), and some approaches to cognitive modeling (ACT-R, CAPS, Soar) have potential for
improving predictions of human performance in realistic task situations by including more
accurate theories of memory. There remains a need for the investigation and development of
more explicit and complete models, with broader scope, of the role of working memory in
human performance.

2.4  Emotions  and  Behavioral  Moderators 

Emotion,  affect,  motivation,  and  other  behavioral  moderators  are  increasingly  being 
seen  as  factors  that  can  and  often  do  influence  cognition.  This  view  has  received  attention 
among  a  range  of  computer  scientists  and  psychologists.  Pew  and  Mavor  (1998,  chap.  9)  lay 
out  an  initial  case  for  including  emotion  as  an  internal  moderator  of  behavior.  The  British 
HCI Group sponsored a one-day meeting on “Affective Computing: The Role of Emotion in
Human Computer Interaction” that attracted 70 people to University College, London
(Monk, Sasse, & Crerar, 1999). Picard’s (1997) recent book provides a useful review of
emotions and computation in general. Sloman’s (1999) review of the book and Picard’s
(1999) response are useful summaries. A further case is also made in the section on the
Sim_Agent Toolkit. We present here an additional argument for including a model of
emotions  and  behavioral  moderators  in  models  of  synthetic  forces,  note  two  potential 
problems  with  existing  models,  and  sketch  an  initial  theory. 

2.4.1  Further  Uses  of  Emotions  and  Behavioral  Moderators 

Models of emotions and behavioral moderators may be necessary for modeling non-doctrinal
performance such as insubordination, fatigue, errors, and mistakes. Many authors
have  also  noted  the  role  of  emotion  in  fast,  reactive  systems  (Picard,  1997,  provides  a  useful 
overview).  Individual  differences  in  emotions  may  be  related  to  personality  and  differences 
in  problem  solving.  That  is,  the  range  of  emotions  may  be  best  explained  as  an  interaction 
that  arises  between  task  performance  and  situation  assessment  and  an  agent’s  likes,  desires, 
and  personal  cognitive  style.  An  argument  is  starting  to  be  put  forward  that  changes  in 
motivation  based  on  temporally  local  measures  of  success  and  failure  may  help  problem 
solving (Belavkin, 2001; Belavkin & Ritter, 2000; Belavkin, Ritter, & Elliman, 1999).

2.4.2  Working  Within  a  Cognitive  Architecture 

Emotions  arise  from  structures  related  to  cognition  and  should  be  closely  related  to  and 
based on cognitive structures. All of the arguments for creating a unified theory of cognition
(Anderson, Matessa, & Lebiere, 1998; Newell, 1990) also apply to creating a unified theory
of emotion. The effects of emotions and other behavioral moderators on cognition
are presumably not task-specific, so their implementation belongs in the architecture, not in
the task knowledge.

Theories  of  emotions  should  thus  be  implemented  within  a  cognitive  architecture.  This 
will  allow  them  to  realize  all  the  advantages  of  being  within  a  cognitive  architecture, 
including  being  reusable  and  being  compared  to  and  incorporated  within  other  models. 
Some models of emotions have been built within a cognitive architecture (Bartl & Dörner,
1998; Belavkin, Ritter, & Elliman, 1999; Franceschini, McBride, & Sheldon, 2001; Gratch
& Marsella, 2001; R. Jones, 1998; Rosenbloom, 1998). Being created within an
information-processing model has required them to be more specified than previous
theories.  Being  part  of  a  model  that  performs  the  task  has  also  allowed  them  to  make 
more  predictions. 

2.4.3  A  Sketch  of  a  Computational  Theory  of  Emotions 

An important aspect of cognition is to process sensory information, assign meaning to it,
and then decide upon a plan of action in response. This is a real-time process in which new
sensory information arrives continuously. This view is similar to the view put forward by
Agre and Chapman (1987) about representationless thinking. The plan must therefore be
dynamically reconfigurable and will often be abandoned in favor of a better plan midway
through its execution. Elliman has a speculative view of the role of emotions in cognition,
similar to Rasmussen’s (1998) stepladder framework of behavior, which makes the
following assumptions:

•  The  amount  of  sensory  data  available  at  any  moment  is  too  large  for  attention  to  be 
given  to  more  than  a  small  fraction  of  the  data. 

•  The  conscious  consideration  of  the  results  of  perception  is  an  expensive  process  in 
terms  of  the  load  on  neural  hardware  and  also  time-consuming. 

•  Most  sensory  processing  is  unconscious  in  its  early  stages  in  order  that  expensive 
conscious  processes  need  consider  only  the  results  of  perception.  These  results 
might  include  labeled  objects  with  a  position  in  space,  for  example  “a  tank  moving 
its  turret  in  that  clump  of  trees.”  Conscious  processes  might  well  add  further  detail 
such  as  the  type  of  tank  and  the  range  of  its  gun. 

•  Attentional  mechanisms  are  needed  to  direct  the  limited  high-level  processing  to  the 
most  interesting  objects.  These  may  be  novel,  brightly  colored,  fast-moving,  or 
potentially  threatening. 

•  Planning  is  an  especially  heavy  computational  process  for  the  human  mind  and  one 
that  is  difficult  to  carry  out  effectively  under  combat  conditions.  (Perhaps  the  best 
way  to  explain  why  military  doctrine  is  useful  is  that  it  distills  the  best  generic 
practice  and  trains  the  soldier  to  behave  in  a  way  that  might  well  have  been  a  chosen 
and  planned  behavior  if  the  individual  had  the  time  and  skill  to  formulate  the  action 
himself. The danger is that no doctrine can envisage all scenarios in advance and, on
occasion,  the  use  of  doctrine  in  a  rigid  manner  may  be  harmful.) 

•  From  an  evolutionary  perspective  this  system  of  unconscious  processing  of  sensory 
input,  attentional  mechanisms,  and  cognitive  planning  (together  with  speech-based 
communication)  is  a  masterstroke  of  competence  for  survival.  However,  it  has  one 
crippling  disadvantage — it  is  too  slow  to  react  to  immediate  and  sudden  attack. 

Rapid  reaction  to  possible  threat  without  the  time  for  much  cognitive  processing  is 
clearly of huge value. In this framework emotion can be seen as a kind of labeling process for
sensory  input.  Fear  particularly  fits  this  pattern  and  is  a  label  that  causes  selected  sensory 
input  to  literally  scream  for  attention.  For  this  process  to  work  rapidly  it  needs  to  be 
hardwired  differently  than  higher-level  cognitive  processes.  There  is  strong  evidence  that 
the  amygdala  is  intimately  involved  in  the  perception  of  threat  and  able  to  trigger  the 
familiar sensation of fear (e.g., Whalen, 1999). If this organ of the brain is damaged,
individuals  may  find  everyday  events  terrifying  while  not  perceiving  any  need  for  alarm  in 
life-threatening  situations. 
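
The view of emotion as a fast labeling process that lets some sensory input jump the queue for attention can be sketched in Python. This is only an illustration of the framework described above, not an implementation from any of the cited work. The `Percept` fields, the `FEAR_THRESHOLD` and `FEAR_BOOST` constants, and the scene contents are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Percept:
    label: str
    novelty: float   # 0..1, how unexpected the object is
    motion: float    # 0..1, how fast it is moving
    threat: float    # 0..1, crude, fast estimate of danger


FEAR_THRESHOLD = 0.7   # assumed cut-off for tagging a percept with "fear"
FEAR_BOOST = 10.0      # assumed priority bonus for fear-tagged percepts


def priority(p):
    """Salience score; fear-tagged percepts jump ahead of everything else."""
    base = p.novelty + p.motion + p.threat
    return base + (FEAR_BOOST if p.threat >= FEAR_THRESHOLD else 0.0)


def attend(percepts, budget):
    """Pick the few percepts that limited high-level processing will examine."""
    ranked = sorted(percepts, key=priority, reverse=True)
    return [p.label for p in ranked[:budget]]


scene = [
    Percept("bird overhead", novelty=0.6, motion=0.9, threat=0.0),
    Percept("tank turret in trees", novelty=0.5, motion=0.2, threat=0.9),
    Percept("parked truck", novelty=0.1, motion=0.0, threat=0.1),
    Percept("flare on ridge", novelty=0.9, motion=0.5, threat=0.3),
]

focus = attend(scene, budget=2)
```

In this made-up scene the tank is neither the most novel nor the fastest-moving object, yet the fear label places it first in `focus`. A crude threshold of this kind would also produce false alarms, which is consistent with the account given in the text.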

This  rapid,  emotive  response  to  sensory  data  is  relatively  crude  and  prone  to  false 
alarms.  Reactive  behavior  is  triggered  that  may  be  involuntary,  for  example,  the  startle 
reaction  and  physiological  changes  due  to  the  release  of  noradrenalin.  After  the  reaction 
response,  it  takes  time  for  cognitive  processes  to  catch  up  and  make  a  more  informed 
assessment  of  the  situation  and  actual  threat.  If  this  emotive,  reactive  stimulation  is  excited 
in  a  chronic  manner  then  susceptible  individuals  may  become  less  effective,  with  impaired 
ability  to  think  and  plan  clearly.  Any  kind  of  anxiety  is  a  form  of  stress.  Because  individuals 
have a finite capacity for absorbing it, excessive stress results in fatigue.

2.5 Errors

Ideally, military behavior is normative, that is, what is done is what should have been
done. Human behavior does not always match the normative ideal of military behaviors.
One of the most important aspects of human performance, which has often been overlooked
in models of behavior and problem solving, is errors (although see, for example, Cacciabue,
Decortis, Drozdowicz, Masson, & Nordvik, 1992; Freed & Remington, 2000; Freed, Shafto,
& Remington, 1998). There is a consensus building about the definition of errors: for most
people an error is something done that was not intended by the actor, that was not desired,
and that placed the task/system beyond acceptable limits (e.g., Senders & Moray, 1991).

Part of the reason for omitting errors from models of behavior is the fallacy that they are
produced by some special error-generating mechanism that can be bolted on to models once
they are producing correct behavior on the task at hand. Often, however, the actions that
precede errors would have been judged to be correct if the circumstances had been slightly
different. In other words, as Mach (1905/1976) observed, knowledge and error both stem
from the same source.

Evidence shows that novices and experienced personnel will often make the same errors
when exposed to the same circumstances. The difference lies in the ability to notice and
recover from these errors. Experienced personnel are more successful at mitigating errors
before the full consequences arise. In other words, it is the management of errors that is
important and needs to be trained (Frese & Altmann, 1989), rather than vainly trying to
teach people how to prevent the inevitable.

2.5.1  Training  About  Errors 

In any complex, dynamic environment, such as a military battlefield, the consequences
of uncorrected errors are potentially disastrous. While normally a string of mistakes is
required to create a disaster, the rapid pace of the battlefield and adversaries allows single
mistakes to become more catastrophic.

There is, therefore, a real need to learn how to manage errors in an environment in
which the consequences are less severe. An advantage of using synthetic environments is
that comparative novices can experiment in unfamiliar situations, with restrictions
approximating the real environment in time, space, enemy capabilities, and so on, but with
the knowledge that the consequences of any errors can be recovered. In addition, multiple
scenarios can be played out over a compressed time period, thereby providing the novice
with a variety of experiences that would take many years to accumulate through exposure to
situations in the real world. This can be a great training aid, literally giving years of
experience in far less time. When novices were trained in aircraft electrical-system
troubleshooting using a simulated system, they were able to acquire years of experience in
months because the tutor let them practice just their diagnostic skills without practicing their
disassembly skills (Lesgold, Lajoie, Bunzo, & Eggan, 1992).

2.5.2 Models That Make Errors

There are several process models complete enough to make errors, depending to some
degree on the definition of error. Models that include errorful behavior exist in EPAM
(Feigenbaum & Simon, 1984; Gobet & Simon, 2000), ACT-R (Anderson, Farrell, & Sauers,
1984; Anderson & Lebiere, 1998; Lebiere, Anderson, & Reder, 1994) and Soar (Bass,
Baxter,  &  Ritter,  1995;  Howes  &  Young,  1996;  Miller  &  Laird,  1996),  although  each 
generates  errors  in  different  ways  and  at  different  levels.  Fewer  models  exist  that  model 
error  recovery,  although  this  is  clearly  the  next  aspect  to  model. 

A problem with models and humans is that the erroneous behavior is often task-specific;
given  a  new  task,  both  models  and  humans  might  not  generate  the  same  behavior.  In  other 
words,  the  erroneous  behavior  arises  as  a  result  of  the  combination  of  human,  technological, 
and  organizational  (environmental)  factors.  Vicente  (1998)  delineates  some  of  the  problems 
in  this  area. 

There  are  various  taxonomies  of  errors  that  could  be  incorporated  into  models  of 
performance.  There  are  also  other  constraints  that  reduce  the  level  of  performance  that  are 
worth  exploring,  including  working  memory  (Young  &  Lewis,  1999),  attention,  and 
processing  speed  due  to  expertise. 

2.6  Adversarial  Problem  Solving 

Adversarial  problem  solving  is  different  from  simple  problem  solving  and  makes 
additional  requirements  for  modeling  behavior  in  synthetic  environments.  Planning  is  not 
done  within  a  static  environment,  but  done  in  an  environment  with  active  adversaries. 

Research on adversarial problem solving (e.g., Chase & Simon, 1973; de Groot,
1946/1978; Gobet & Simon, 2001; Newell & Simon, 1972) has identified several aspects of
cognitive behavior that have been shown to generalize to other domains, including the
military domain (Charness, 1992). A key result is that players do not follow a strategy such
as minimax but that they satisfice (Simon, 1955), that is, they satisfy themselves with a
good-enough solution, which can be far from the optimal solution (de Groot & Gobet, 1996;
Gobet  &  Simon,  1996a).  This  satisficing  behavior  can  be  explained  by  the  processing  and 
capacity  limits  of  human  cognition,  such  as  the  time  to  learn  a  new  chunk  or  the  capacity  of 
short-term  memory  (Newell  &  Simon,  1972). 
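
The contrast between satisficing and optimizing can be made concrete with a short Python sketch. It is a simplified illustration, not a model from the cited literature. The exhaustive `optimize` baseline merely stands in for an optimizing strategy and is not minimax itself, and the aspiration level, evaluation budget, and move scores are invented for the example.

```python
def satisfice(options, evaluate, aspiration, max_evaluations):
    """Return the first option whose value meets the aspiration level.

    Falls back to the best option seen if the evaluation budget runs out.
    """
    best, best_value = None, float("-inf")
    examined = 0
    for option in options:
        if examined >= max_evaluations:
            break
        value = evaluate(option)
        examined += 1
        if value > best_value:
            best, best_value = option, value
        if value >= aspiration:
            break   # good enough: stop searching
    return best, examined


def optimize(options, evaluate):
    """Exhaustive baseline: evaluate every option and return the best."""
    options = list(options)
    return max(options, key=evaluate), len(options)


# Hypothetical candidate moves with made-up evaluation scores.
scores = {"a": 0.2, "b": 0.65, "c": 0.4, "d": 0.9, "e": 0.7}

choice, cost = satisfice(scores, scores.get, aspiration=0.6, max_evaluations=4)
optimum, full_cost = optimize(scores, scores.get)
```

Here the satisficer stops at move `b` after two evaluations, while the exhaustive search finds the better move `d` at the cost of evaluating all five. The `max_evaluations` parameter is one simple way to represent the processing and capacity limits mentioned above.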

A  second,  related  aspect  is  that  a  player’s  search  is  highly  selective:  only  a  few  branches 
of the search tree are explored. The choice of subspace to search seems to be constrained by
pattern-recognition  mechanisms  (Chase  &  Simon,  1973;  Gobet,  1998;  Gobet  &  Simon, 
1996a).  A  consequence  is  that  misleading  perceptual  cues  may  result  in  the  exploration  of 
an  incorrect  subspace.  For  example,  Saariluoma  (1990)  reported  that  chess  masters  found  a 
suboptimal  solution  when  the  features  of  the  position  led  them  to  look  for  a  standard, 
although  inferior,  subspace.  The  consequence  for  understanding  combatant  behavior  is  that 
pattern  recognition  may  influence  the  course  of  action  chosen  as  much  as  the  detail  of  the 
way  the  search  is  carried  out.  In  fact,  de  Groot  (1946/1978)  did  not  find  differences  in  the 
macrostructure of search of chess players at different skill levels.

A  third  important  result  is  that  chess  players  re-investigate  the  same  sequence  of  actions 
several times, interrupted or not by the analysis of other sets of actions. De Groot (1946) has
called this phenomenon progressive deepening. It is related to the selective search shown by
experts in other areas (Charness, 1991; Ericsson & Kintsch, 1995; Gobet & Simon, 1996a;
Hoffman, 1992). De Groot and Gobet (1996) propose that progressive deepening is due both
to the limits of human cognition (limited capacity of short-term memory, slow encoding
time in long-term memory) and to the fact that, with this searching behavior, information
gathered at various points of the search may be propagated to other points, including
previously visited points (this could not be done with a search behavior such as minimax).

These features of cognition, identified in adversarial problem solving, also occur in
Rapid Decision Making (RDM) in domains such as firefighting, combat, and chess playing
under time trouble. Interestingly, the model developed by Klein and his colleagues (see Klein,
1997, for a review) singles out the same features as the model developed by Chase and
Simon (1973) to explain expert chess-playing: pattern recognition, selective search, and
satisficing behavior.

While  some  aspects  of  adversarial  problem  solving  are  well  understood,  others  have  yet 
to  be  studied  in  any  depth.  Such  aspects  include  the  way  the  function  used  to  evaluate  the 
goodness  of  a  state  (the  evaluation  function)  changes  as  a  function  of  time,  the  link  between 
the  evaluation  function  and  pattern  recognition,  or  the  learning  of  domain-specific 
heuristics,  which  all  have  direct  implications  for  combat  behavior. 

Relatively  little  research  has  been  done  on  how  players  take  advantage  of  the  thinking 
particularities  of  their  opponent,  in  particular,  by  trying  to  outguess  him  or  her.  Jansen 
(1992)  offers  interesting  results.  He  has  developed  a  computer  program  that  takes  advantage 
of  some  features  and  heuristics  of  human  cognition  in  simple  chess  endgames,  such  as  the 
tendency, in human players’ search, to avoid moves that lead to positions with a high
branching factor, and to prefer moves that lead to forced replies. Using these features and
incorporating them in its evaluation function, the program was able to win faster (in won
positions) or to avoid defeat (in lost positions) more often against human players than by
using a standard alpha-beta search. In principle, such an approach could be extended to
include  both  skill-related  and  individual  differences  in  synthetic  environments. 

In  comparison  to  perception  and  memory  in  games,  relatively  little  computer  modeling 
of  human  behavior  has  been  done  with  adversarial  problem  solving  (if  one  excludes  pure 
Artificial  Intelligence  [AI]  research,  in  which  adversarial  problem  solving  has  been  a 
favorite  subject  of  research).  One  may  mention  the  previous  work  of  Simon  and  colleagues 
(Baylor  &  Simon,  1966;  Newell,  Shaw,  &  Simon,  1958),  and  the  programs  of  Pitrat  (1977), 
Wilkins  (1980),  and  Gobet  and  Jansen  (1994).  All  of  these  programs  were  created  for  chess 
and  most  cover  only  a  subset  of  the  game. 

There are implications of adversarial search variation for performance (i.e., how well a
planner  models  an  opponent).  This  would  be  a  natural  place  to  model  various  levels  of 
experience  in  opponents. 

2.7  Variance  in  Behavior 

Including  more  variety  in  how  a  model  performs  a  task  is  one  of  the  next  steps  for 
improving  the  realism  of  synthetic  forces.  Currently,  many  models  will  execute  a  task  the 
same way every time and for every equivalent agent. In the real world, this is not the case.
The choice of strategies and the ordering of substrategies will vary across agents and vary
for a given agent across time. This lack of variance makes adversaries and allies too
predictable in that they always do the same thing.

Including  variance  in  behavior  is  also  necessary  when  behavior  is  less  predictable. 
Novices,  with  less  knowledge,  have  greater  variance  in  behavior  (Rauterberg,  1993).  In  the 
past,  variance  was  intentionally  suppressed  in  simulations  because  it  was  thought  that 
variance in real behavior was suppressed through doctrine and training. Accounting for
variety in behavior is of increasing importance when modeling less-prepared and less-trained
forces, and now for improving model accuracy as variance in real behavior
is admitted.

Variance  in  behavior  is  also  important  when  modeling  non-combatant  agents,  such  as 
white  forces  and  civilians.  These  agents  may  be  producing  their  behaviors  deterministically, 
but  the  determiners  are  often  hidden  from  other  agents,  making  them  appear  relatively 
unpredictable.  Finally,  the  ability  to  model  a  variety  of  behaviors  is  necessary  for 
sensitivity  analysis. 

Variance  will  arise  out  of  several  factors.  It  may  arise  from  different  levels  of  expertise, 
which is covered above. It may arise from different strategies, which will require including
multiple  strategies  and  noting  where  orders  are  less  likely  to  be  followed  and  when  panic 
results  in  orders  being  ignored.  Variance  may  also  arise  as  a  type  of  error,  such  as  applying 
a  right  action  in  the  wrong  circumstances. 
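
One simple way to give agents varied yet reproducible behavior is to sample among multiple strategies instead of always taking the preferred one. The Python sketch below is an assumption-laden illustration, not a method taken from the work reviewed here. The softmax selection rule, the `temperature` parameter (used as a stand-in for how strongly doctrine and training suppress variance), and the strategy names and preference values are all hypothetical.

```python
import math
import random


def choose_strategy(preferences, temperature, rng):
    """Sample one strategy; higher temperature means more varied choices."""
    names = sorted(preferences)
    weights = [math.exp(preferences[n] / temperature) for n in names]
    return rng.choices(names, weights=weights, k=1)[0]


def run_agent(preferences, temperature, seed, trials):
    """Return the sequence of strategies one agent picks over several trials."""
    rng = random.Random(seed)   # per-agent seed keeps each run reproducible
    return [choose_strategy(preferences, temperature, rng) for _ in range(trials)]


# Hypothetical strategy preferences for a patrol task.
prefs = {"flank-left": 1.0, "flank-right": 0.8, "hold-position": 0.3}

trained = run_agent(prefs, temperature=0.05, seed=1, trials=200)
novice = run_agent(prefs, temperature=1.0, seed=1, trials=200)
```

With a low temperature the agent almost always selects its preferred strategy, while a high temperature spreads its choices across all three. Giving each agent its own seed lets equivalent agents differ from one another while keeping any single run repeatable, which is useful for sensitivity analysis.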

In any case, variance in agent behavior in synthetic environments particularly needs to
be included in training materials. Humans are very good pattern recognizers (although they
do not always look for or know the right pattern) and will take advantage of models that do
not vary their behavior. The real opponents may not be so predictable.

2.8  Information  Overload 

Problems with information overload have been noted numerous times (e.g., Woods,
Patterson,  Roth,  &  Christoffersen,  1999).  Hoffman  and  Shadbolt  (1996)  provide  a  review  of 
work  on  information  overload  in  real-time,  high-workload  military  contexts.  They  also 
discuss  challenges  that  information  overload  raises  for  knowledge  acquisition  in  the  context 
of  synthetic  forces  environments. 

Problems resolving clutter, workload bottlenecks, and finding significance in incoming
data are not yet problems for many models of human performance. Currently, most
cognitive  and  synthetic  force  models  do  not  face  information  overload.  The  situation  has 
more  typically  been  of  a  model  seeing  only  a  limited  set  of  information  and  knowing  how  to 
perform  only  one  or  a  few  tasks. 

In  the  near  future,  the  models  will  have  more  complex  simulated  eyes  as  well  as  more 
knowledge  to  interpret  the  eyes’  input.  This  will  lead  to  more  incoming  information  with  a 
more  difficult  problem  of  deciding  which  objective  to  pursue  next  and  how  to  choose  the 
best  strategy  based  on  a  larger  set  of  knowledge  and  perceptual  inputs.  We  will  also  find 
that  models  will  start  to  have  trouble  with  information  overload,  clutter,  and  situation 
assessment.  Their  tactics  in  this  area  will  be  particularly  important  when  there  are  time 
pressures, which are common in synthetic environments and the worlds they model.

Current Objective: Better Integration

There are theoretical and practical problems integrating models with simulations and
with other models. The problems can appear to be simply software issues, but deeper
theoretical issues often go hand-in-hand with these problems. We thus note a few of these
problems in getting models to interact with simulations as well as the basic problem of
aggregating models.

3.1  Perception 

At least since de Groot’s early work (1946), perception has been deemed to play an
essential role in cognition. Neisser (1976, p. 9) aptly summarizes it as “perception is where
cognition and reality meet.” This point of view has been buttressed in recent years with the
emphasis given by Nouvelle AI (e.g., Brooks, 1992), which is based on reactive
architectures, perceptual mechanisms, and on their coupling with motor behavior.
Neuroscience (e.g., Kosslyn & Koenig, 1992) teaches that, due to evolutionary pressure, a
large part of the brain deals with perception (mainly vision); hence, an understanding of
perception is essential for understanding the behavior of combatants.

Perception-based behavior offers a series of advantages: it is fast, attuned to the
environment, and optimized with respect to its coupling with motor behavior. However, its
disadvantages include its tendency to be stereotyped and to lack generalization. In addition,
from the point of view of the modeler, it is a difficult behavior to simulate well. This is in
part due to the fact that low-level perception is still poorly understood (Kosslyn & Koenig,
1992), although recent progress in robotics and agent behavior gives examples of successful
implementation of basic perceptual mechanisms for use by cognition (e.g., Brooks, 1992;
Zettlemoyer & St. Amant, 1999; and St. Amant & Riedl, 2001).

Perception may be seen as the common ground where various aspects of cognition meet,
including motor behavior, concept formation and categorization, problem solving, memory,
and emotions. In several of these domains, computer simulations illustrating the role of
perception have been developed.

Brooks (1992) and others have investigated the role of perception in motor behavior
with simple insect-like robots. The link between concept formation and (high-level)
perception has been studied using the EPAM architecture (Gobet, Richman, Staszewski, &
Simon, 1997). The role of perception in problem solving has been studied using Chunk
Hierarchy and REtrieval Structures (CHREST), a variation of EPAM (Gobet, 1997; Gobet
& Jansen, 1994) that also accounts for multiple memory regularities. Eye movements are
simulated in detail in CHREST but not the low-level aspect of perception. (We will deal
with the relation between problem solving and perception in Sec. 3.2.)

A more detailed simulation of low-level aspects of perception, such as feature
extraction,  is  an  important  goal  for  the  future  of  research  on  the  relation  of  perception  to 
other  aspects  of  cognition.  In  addition,  little  work  has  been  done  on  modeling  perception  in 
dynamically  changing  environments  and  on  the  effects  of  stress,  emotion,  motivation,  and 
group  factors  on  perception. 

It  is  useful  to  separate  perception  from  cognition  in  modeling  human  performance.  The 
border  between  the  model  of  the  person  and  their  environment  can  (arguably)  be  drawn  at 
the  boundary  between  cognition  and  perception,  with  perception  belonging  to  a  large  extent 
in the environment model. This may be true for psychological reasons (Pylyshyn, 1999). It
is also useful in practice, because it supports tying models to simulations and lets cognition
use the resulting knowledge in problem solving (Ritter, Baxter, Jones, & Young, 2000). The typical acts
performed  by  perception  and  motor  action,  such  as  determining  the  objects  in  view,  their 
shapes  and  sizes,  and  then  manipulating  them,  are  most  easily  performed  where  the  objects 
reside.  This  forces  the  implementation  of  theories  of  interaction  into  the  simulation  language 
instead  of  the  modeling  language. 

It  would  be  useful  to  have  realistic  stochastic  distributions  of  differences  in  perception 
among  individual  agents,  and  also  the  ability  to  augment  perception  with  instruments  from 
field  glasses  to  night  sights.  These  devices  could  be  modeled  as  plug-ins  to  the  perception 
model.  Models  of  perception  in  synthetic  environments  are  typically  simple,  being  a 
function  of  distance  from  observer  to  object  (e.g.,  if  there  is  a  clear  line  of  sight  and  the 
absence  of  cover  and  smoke).  On  the  other  hand,  human  vision  changes  in  important  ways 
with  the  ambient  level  of  light  and  with  the  part  of  the  retina  on  which  an  image  falls.  The 
edges  of  the  retina  are  particularly  sensitive  to  the  detection  of  a  moving  object,  while  the 
fovea  has  the  best  resolution  for  identifying  distant  objects  and  is  most  sensitive  to  color. 
The  distance  at  which  an  object  can  be  seen  depends  on  its  brightness,  its  size,  and  its 
contrast  to  the  background  as  well  as  the  permeability  of  the  air  to  light.  Thus,  a  detonation 
will  be  visible  from  a  much  greater  range  than  a  moving  tank,  which  in  turn  will  be  much 
easier  to  spot  than  a  motionless,  camouflaged  soldier. 
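
As a rough illustration of how the factors above might be combined, the following Python sketch computes a detection range from target size, contrast, brightness, motion, ambient light, and atmospheric attenuation, with instruments such as field glasses or night sights supplied as plug-ins. It is a sketch under stated assumptions, not a validated model of vision: the functional form, the constants, the example targets, and the names Target, Instrument, and detection_range_m are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass

# Assumed constants, chosen only to give plausible-looking numbers.
BASE_RANGE_PER_METER = 2000.0  # clear-air range per meter of target size
MOTION_BONUS = 1.5             # moving targets are assumed easier to detect


@dataclass
class Target:
    size_m: float      # characteristic size of the object, in meters
    contrast: float    # 0..1 contrast against the background
    brightness: float  # 0..1 own luminance (a detonation is near 1)
    moving: bool = False


@dataclass
class Instrument:
    """Hypothetical plug-in that modifies unaided perception."""
    name: str
    magnification: float = 1.0
    light_gain: float = 1.0


def detection_range_m(target, ambient_light, attenuation_per_km,
                      instrument=None, line_of_sight=True):
    """Return an illustrative maximum detection range in meters."""
    if not line_of_sight:
        return 0.0
    inst = instrument if instrument is not None else Instrument("unaided eye")
    light = min(1.0, ambient_light * inst.light_gain)
    # A self-luminous target stays visible even when ambient light is low.
    visibility = target.contrast * max(light, target.brightness)
    if target.moving:
        visibility *= MOTION_BONUS
    clear_air = (BASE_RANGE_PER_METER * target.size_m
                 * inst.magnification * visibility)
    # Assumed attenuation: range shrinks as the air becomes less permeable.
    return clear_air / (1.0 + attenuation_per_km * clear_air / 1000.0)


# Illustrative targets (all values are made up for the example).
detonation = Target(size_m=10.0, contrast=1.0, brightness=1.0)
tank = Target(size_m=7.0, contrast=0.4, brightness=0.1, moving=True)
soldier = Target(size_m=1.8, contrast=0.05, brightness=0.05)
field_glasses = Instrument("field glasses", magnification=7.0)
night_sight = Instrument("night sight", light_gain=50.0)
```

With these made-up values in daylight, the sketch reproduces the ordering described above: the detonation is detected at a greater range than the moving tank, which is detected at a greater range than the motionless, camouflaged soldier. Passing field_glasses or night_sight as the instrument argument extends the range without changing the underlying perception function.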

Situation  awareness  is  a  term  that  is  still  the  subject  of  much  debate  in  the  human 
factors and ergonomics communities (e.g., see the Special Issue of Human Factors, Volume
37, Issue 1). Pew and Mavor (1998) consider situation awareness to be a key concept in the
understanding of military behavior. We agree, but also believe that situation awareness
should  be  modeled  at  a  finer  level  of  detail  than  is  currently  often  done  (see  Pew  &  Mavor, 
1998,  chap.  7,  for  a  current  review). 

3.2  Combining  Perception  and  Problem  Solving 

Pew  and  Mavor  (1998)  note  that  an  important  constraint  on  problem  solving  is 
perception,  but  do  not  explore  this  in  detail.  As  mentioned  in  our  discussion  on  expertise, 
perception  plays  an  important  role  in  skilled  behavior — experts  sometimes  literally  see  the 
solution  to  a  problem  (de  Groot,  1946/1978). 

We  may  use  Kosslyn  and  Koenig’s  (1992)  definition:  higher-level  visual  processing 
involves  using  previously  stored  information;  lower-level  visual  processing  does  not  involve 
such stored information and is driven only by the information impinging on the retina. We
focus here on higher-level perception and, thus, we will not consider mechanisms used for
finding edges, computing depth, and so on.

Human Systems IAC SOAR, 2003

Neisser’s  Cognition  and  Reality  (1976)  describes  what  is  often  referred  to  as  the 
perceptual  cycle.  This  approach  underpins  a  vast  amount  of  the  cognitive  engineering 
literature  and  research.  At  its  simplest,  the  perceptual  cycle  is  a  cycle  between  the 
exploration  of  reality  and  representing  this  reality  as  schemas  (in  the  general  sense). 
Schemas direct exploration (perceptual, haptic, etc.) that involves sampling the object
(looking at the real world), which may alter the object, which means that the schemas have
to be modified. (See Neisser, 1976, p. 21, or p. 112 for a more complete description.) This
work  suggests  that  an  important  aspect  of  behavior  has  been  missing  from  many  theories 
and  models  of  problem  solving  that  have  not  included  perception. 

It  is  natural  that  researchers  have  attempted  in  recent  years  to  combine  perception  and 
problem  solving  in  artificial  systems.  One  can  single  out  three  main  approaches:  robotics, 
problem-solving  architectures  incorporating  perception,  and  perceptual  architectures  being 
extended  to  problem  solving. 

In robotics, Nouvelle AI has attempted to build robots able to carry out simple
problem-solving behavior without explicit planning by linking sensor and motor abilities tightly (e.g.,
the  behavior-based  architecture  of  Brooks,  1992).  Robots  based  on  this  approach  are 
excellent  at  obstacle-avoiding  behavior.  It  is,  however,  unclear  how  far  this  approach  can  be 
extended  to  more  complex  problem  solving  without  incorporating  some  sort  of  planning. 

Including perception in behavioral models is a useful way to add natural competencies
and limitations to behavior. Pew and Mavor note that there are few models of how
perception influences problem solving. Their summary can be extended and revised in this
area, however. We have seen in existing cognitive models (Byrne, 2001; Chong, 2001; de
Groot & Gobet, 1996; Gobet, 1997; Jones, Ritter, & Wood, 2000; Ritter & Bibby, 2001;
Salvucci, 2001) and in AI models (Elliman, 1989; Grimes, Picton, & Elliman, 1996; St.
Amant & Riedl, 2001) that perception is linked to and can provide behavioral competencies
and restrictions on problem solving. While Pew and Mavor note that they are unaware of
any attempt in Soar to model the detailed visual perceptual processes in instrument scanning
(Pew & Mavor, 1998, p. 181), such models exist (Aasman, 1995; Aasman & Michon, 1992;
Bass et al., 1995), and some are even cited by Pew and Mavor (1998, p. 95) for
other reasons.

The Soar model reported by Bass et al. (1995) scans a simple air-traffic control display
to  find  wind  velocity.  The  model  learns  (chunks)  this  information  and  uses  it  and  the  display 
to  track  and  land  a  plane  through  airport  air  traffic  control.  The  model  then  reflects  on  what 
it  did  to  consider  a  better  course  of  action.  This  model  shows  tentative  steps  towards  using 
Soar’s  learning  mechanism  for  situation  learning  and  assessment  based  on  information 
acquired  through  active  perception  (see  Pew  &  Mavor,  1998,  p.  197).  Modeling  visual 
cognition  within  Soar  is  ongoing  at  the  University  of  Southern  California’s  Information 
Sciences Institute (USC/ISI; Hill, 1999) and at the Pennsylvania State University.

The  EPAM  architecture  (Feigenbaum  &  Simon,  1984),  the  initial  goal  of  which  was  to 
model  memory  and  perception,  has  recently  been  extended  into  a  running  production  system 
(Gobet & Jansen, 1994; Lane, Cheng, & Gobet, 1999). The chunks learned while interacting
with the task environment can later be used as conditions of productions. The same chunks
are  also  used  for  the  creation  of  schemas  and  for  directing  eye  movements. 

Recently,  there  have  been  several  attempts  to  move  the  perception  component  from 
models  into  the  architectures,  regularizing  and  generalizing  the  results  in  the  process. 
Prominent  cognitive  architectures  Soar  and  ACT-R  have  been  extended  to  incorporate 
perceptual  modules,  and  PSI  also  has  a  perceptual  module.  With  Soar,  a  perceptual  module 
is  available  based  on  EPIC  (Chong  &  Laird,  1997)  and  another  based  loosely  on  a  spotlight 
theory  of  attention  (Ritter  et  al.,  2000).  With  ACT-R,  two  perceptual  modules  have  been 
developed  independently:  the  Nottingham  architecture  (Ritter  et  al.,  2000)  and  ACT-R/PM 
(based  on  but  also  extending  EPIC;  Byrne,  1997,  2001).  This  approach  creates  situated 
models  of  cognition,  that  is,  models  that  interact  with  (simulations  of)  the  real  world. 

None  of  these  approaches  has  been  tested  with  complex,  natural,  and  dynamically 
changing  environments.  The  robotics  approach  is  the  only  one  currently  demonstrated  to 
cope  with  natural,  albeit  rather  simple,  environments.  The  two  other  approaches  can  interact 
with  computer  interfaces  that  are  complex  and  dynamic  (e.g.,  Salvucci,  2001). 

3.3  Integration  of  Psychology  Theories 

A  glance  at  almost  any  psychology  textbook  reveals  that  the  study  of  human  cognition 
is  conventionally  divided  into  topics  that  are  presented  as  if  they  have  little  to  do  with  each 
other.  There  will  be  separate  chapters  on  attention,  memory,  problem  solving,  and  so  on. 
However,  the  range  and  variety  of  tasks  undertaken  by  people  at  work,  and  also  those 
tackled  by  synthetic  agents,  typically  require  the  application  and  interplay  of  many  different 
aspects  of  cognition  simultaneously  or  in  close  succession.  Interacting  with  a  piece  of 
electronic  equipment,  for  example,  can  draw  upon  an  agent’s  capacity  for  perception, 
memory,  learning,  problem  solving,  motor  control,  decision  making,  and  many  more 
capabilities.  The  question  of  how  to  integrate  these  different  facets  of  cognition  is  therefore 
an  important  one  for  the  simulation  of  human  behavior. 

Integrating  theories  across  different  topics  of  cognition  is  an  issue  that  has  rarely  been 
addressed  directly  and  provides  an  important  focus  for  future  work.  Agents  in  synthetic 
environments  (e.g.,  R.  Jones,  Laird,  Nielsen,  Coulter,  Kenny,  &  Koss,  1999)  implicitly 
integrate  multiple  aspects  of  behavior.  What  research  exists  has  been  carried  out, 
appropriately  enough,  under  the  heading  of  unified  theories  of  cognition  using  architectures 
such  as  Soar  and  ACT-R.  Soar  offers  a  promising  basis  for  such  integration.  Its  impasse- 
driven  organization  enables  it  to  access  different  areas  of  cognitive  skill  as  the  need  arises, 
and  its  learning  mechanism  (which  depends  on  cognitive  processing  in  those  impasses) 
enables  relevant  information  from  the  different  areas  to  be  integrated  into  directly  applicable 
knowledge  for  future  use.  ACT-R  also  integrates  multiple  components. 

3.4  Integration  and  Reusability  of  Models 

Integration of theories can also be viewed as integration of models as software,
sometimes called reuse. Reuse has long been recognized as important, for two
fundamental reasons. First, reuse saves effort. In the field of object-oriented software
development, figures are often quoted for the costs associated with development with reuse
in mind. The extra time spent in initial development is something like 20%. When the code
is reused, an application can be created in 40% of the development time for new code.
Second, and perhaps more importantly in these domains, reuse ensures consistency across
simulations and time, which is particularly important when creating unified theories of cognition.
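
The figures quoted above can be turned into a small break-even calculation. The Python sketch below is only an illustration under simplifying assumptions that are ours, not part of the quoted figures: costs add linearly, the 20% overhead is paid once on the first application, and every later application costs 40% of a from-scratch development. The function names are hypothetical.

```python
def cost_with_reuse(n_apps, overhead=0.20, reuse_fraction=0.40):
    """Total cost of n_apps applications when the first is built for reuse.

    Costs are in units of one from-scratch development effort.
    """
    if n_apps <= 0:
        return 0.0
    first = 1.0 + overhead                 # extra effort to design for reuse
    later = reuse_fraction * (n_apps - 1)  # each later application reuses code
    return first + later


def cost_without_reuse(n_apps):
    """Total cost when every application is developed from scratch."""
    return float(max(n_apps, 0))


def break_even_apps(overhead=0.20, reuse_fraction=0.40, max_apps=1000):
    """Smallest number of applications for which reuse is cheaper, or None."""
    for n in range(1, max_apps + 1):
        if cost_with_reuse(n, overhead, reuse_fraction) < cost_without_reuse(n):
            return n
    return None
```

Under these assumptions, designing for reuse costs more for a single application (about 1.2 units instead of 1) but is already cheaper by the second application (about 1.6 units instead of 2), so break_even_apps() returns 2.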

There are also serious problems restricting the reuse of cognitive models. Cognitive
models are not generally reused, even when they have been created in a cognitive
architecture designed to facilitate their reuse. There are exceptions. One is Version 2 of
Pearson’s Symbolic Concept Acquisition model and its explanatory displays
(available at ai.eecs.umich.edu/soar/soar-group.html). Other exceptions include PDP toolkits
such as O’Reilly’s PDP++ (www.cnbc.cmu.edu/PDP++/PDP++.html). But, overall,
cognitive modeling does not have the level of system reuse and visual displays that the AI
and expert systems communities now take for granted. This problem is being noticed by
others as well (Wray, 2001).

There are some examples of reuse that should be emulated and expanded. ACT-R now
maintains a library of existing models (act.psy.cmu.edu). We have found that the mere
existence of a library of student models (www.nottingham.ac.uk/pub/soar/nottingham/) has
led to increasingly better student projects. Work by Young (1999) on building a zoo of
runnable cognitive models is another example of such use done broadly. There is little
reason to believe that these results would not scale up. These improvements to the modeling
environment have helped move learning Soar (Ritter & Young, 1999) and ACT-R
(Anderson & Lebiere, 1998) from being a lengthy apprenticeship to being something that
can be taught in undergraduate courses.

Such integration is illustrated most clearly in a model of natural language sentence
processing (Lewis, 1993), in which lexical, syntactic, semantic, pragmatic, and
domain-specific knowledge are brought together in learned rules (Soar chunks) to guide language
comprehension. Probably the model that has gone furthest in demonstrating this kind of
integration is the cognitive model of the NASA Test Director, the person responsible for
coordinating the preparation and launch of the space shuttle. Nelson, Lehman, and John
(1994) describe a Soar model of a fragment of the Test Director’s performance, which
incorporates problem solving, listening to audio communications, understanding language,
speaking, visual scanning (through a procedure manual), page turning, and more. Such
integrated models are also starting to be created in ACT-R (Anderson & Lebiere, 1998).

Integration of a slightly different flavor (across capabilities rather than across
textbook-like topics of cognition) is illustrated in another Soar model, this one being of exploratory
learning of an interactive device (Rieman et al., 1996). At first glance, it might seem that
exploratory learning is not especially relevant to the human behavior that is, apart from
questions of training, the main focus of this report. Fighter pilots and tank commanders are
highly trained and expert individuals, and presumably do not learn significantly from further
experiences. However, component skills such as comprehending a novel situation, looking
around to discover relevant options, and assessing a course of action (which are
fundamental components of expert skill) are also precisely what is required for
exploratory learning and reactive planning in uncertain environments.

Rieman et al. (1996) describe the IDXL model, which models an experienced computer
user employing exploratory learning to discover how to perform specified tasks with an
unfamiliar software application. IDXL searches both the external space provided by the
software and the internal space of potentially relevant knowledge. It seeks to comprehend
what  it  finds  and  approximates  the  rationally  optimal  strategy  (Anderson,  1990)  for 
exploratory  search.  A  typical  sequence  of  interrelated  capabilities  would  be  for  the  model 
first  to  learn  how  to  start  a  spreadsheet  program  from  external  instruction;  then  to  use  that 
new  knowledge  as  a  basis  for  analogy  to  discover  how  to  start  a  graph-drawing  package; 
and  then  to  build  on  its  knowledge  by  learning  through  exploration  how  to  draw  a  graph. 
The  model  works  with  a  limited  working  memory,  employs  recognition-based  problem 
solving  (Howes,  1993),  and  acquires  display-based  skill  (Payne,  1991)  in  an  interactive, 
situated  task. 

These  problems  of  reusability  are  even  more  acute  when  creating  models  for  synthetic 
environments because of the size and type of models. There are several reasons. The
knowledge is more extensive and exact than in many laboratory domains previously studied.
The  models  must  interact  with  complex,  interactive  simulations.  The  work  may  be 
classified,  which  will  add  an  additional  constraint  in  hiring  someone  with  multiple  skills. 
Scenarios  may  simulate  hours  of  behavior  rather  than  the  minutes  of  typically  modeled 
laboratory  tasks.  This  represents  a  lot  of  knowledge,  and  the  timeframe  can  make 
troubleshooting  more  difficult.  Finally,  there  are  many  cases  where  an  explanation  facility  is 
required  to  explain  the  model’s  behavior  for  other  observers. 

3.5  Summary 

A  framework  to  assist  with  integration  and  reuse  will  have  to  be  developed.  It  should  be 
common  in  the  sense  that  the  appropriate  simulation  entities  and  analysis  tools  would  be 
available,  and  for  a  given  application  or  analysis,  developers  would  plug  them  together.  The 
DIS protocol and ModSAF are being used in this way to some extent, but they provide
neither the desired ease of use nor the desired level of cognitive realism.

Current Objective: Improved Usability

In  addition  to  improving  the  match  of  synthetic  forces  to  human  behavior  itself,  there 
are  several  aspects  of  these  models  that  must  be  improved  so  they  can  be  developed,  tested, 
and  used  by  modelers  and  analysts.  A  large  amount  of  time  is  often  required  to  build  models 
and  understand  their  behavior,  more  than  we  believe  should  be  necessary.  The  difficulties  of 
simply  creating  and  manipulating  models  of  behavior  can  preclude  us  from  spending  more 
time  developing  and  testing  models,  and  using  these  models  in  training  or  for  performing 
“what-if” analyses.

While  Pew  and  Mavor  (1998,  p.  10)  initially  note  that  their  report  will  not  address 
usability,  they  later  (p.  282)  note  the  need  to  have  quickly  reconfigurable  models.  They  also 
discuss  (p.  292)  ease  of  use.  This  revision  is  completely  appropriate  because  usability  is 
important.  Models  that  are  too  difficult  to  be  used  are  not  used.  This  issue  is  also  being 
taken  up  in  the  next  generation  of  simulation  models  in  the  United  States  (Ceranowicz, 
1998). 

4.1  Usability  of  the  Models 

As  we  have  noted  before  (Ritter,  Jones,  &  Baxter,  1998b;  Ritter  &  Larkin,  1994), 
cognitive models suffer from usability problems. Few lessons from the field of
Human-Computer Interaction (HCI) have been re-applied to increase the understanding of the
models themselves, even though many results and techniques in HCI have been discovered
using cognitive modeling.

Modelers  have  to  interact  with  the  model  several  times  and  in  several  ways  over  the 
lifetime  of  the  model.  As  a  first  step,  the  models  must  be  easy  to  create.  As  part  of  the 
creation  and  validation  process,  the  models  must  be  debugged  on  the  syntactic  level  (will  it 
run?), on the knowledge level (does it perform the task?), and on a behavioral level (does it
perform the task like a human?). All of these levels are important if the costs of acquiring
behaviors are to be reduced. While we can point to some recent advances in usability
(Anderson  &  Lebiere,  1998;  Jones,  1999b;  Kalus  &  Hirst,  1999;  Ritter  et  al.,  1998b),  further 
work  will  be  required. 

It is also probably fair to say that cognitive models can often be difficult to explain and
understand. This problem has been noted as a result in a recent Air Force model comparison
exercise, AMBR, covered in more detail in Section 6.2.7 (Gluck & Pew, 2001a). The
difficulty in understanding a model’s behavior is partially due to the model’s complexity, but it is
compounded at times by interfaces that do not support the models in a
structured way, do not display the model’s state, and do not support exploration of the
model’s state. In many cases this is not intentional, but arises out of the modeling languages’
youth as programming languages, and because support for usability takes time away from
applications and modeling itself.

4.2 Desired Accuracy of the Models

Another problem is knowing when to stop improving the model. In science for science’s
sake, there is no limit: the model is continually improved. In the case of engineering-like
applications, such as behavioral models in synthetic environments, knowing when to stop is
a  valid  question.  In  many  cases  we  do  not  know  how  accurate  these  models  have  to  be  in 
order  to  be  useful  and  at  what  point  additional  accuracy  is  no  longer  worthwhile.  For 
example,  does  having  an  emotional,  simulated  opponent  lead  to  better  or  worse  training? 

The purpose and goals of each modeling project will help determine when to stop
development, so they need to be carefully laid out when developing a model of behavior.
The stopping rule applies to the synthetic environment as well as to the model: there is
no point in developing a simulation that is too detailed. This question is becoming more
important as the models become more accurate and modifiable.

4.3  Aggregation  and  Disaggregation  of  Behaviors 

A clear requirement for simulations in synthetic environments is the ability to aggregate
or summarize subunits and, in other situations, the ability to disaggregate and place the
subunits from a larger grouping. When the tanks in a platoon are each simulated in a
platform-level simulation, they must be aggregated to display them as a platoon on a more
abstract or larger-scale map. Similarly, higher-level units may have to be placed into a
simulation  when  moving  a  larger  unit  into  a  platform-level  simulation.  This  aggregation  (or 
disaggregation)  may  need  to  occur  multiple  times  when  crossing  levels  of  resolution  to 
provide  the  right  level  for  a  report. 
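
The aggregation and disaggregation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the mechanism of any particular simulation: the Platform and Unit classes, the choice of the centroid and mean strength as the summary, and the line-abreast placement with a fixed spacing are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Platform:
    """A single entity (for example, one tank) in a platform-level simulation."""
    x: float
    y: float
    strength: float  # 0..1, hypothetical readiness measure


@dataclass
class Unit:
    """An aggregated unit (for example, a platoon) on a larger-scale map."""
    x: float
    y: float
    n_platforms: int
    strength: float


def aggregate(platforms):
    """Summarize platforms as one unit at their centroid (assumed summary)."""
    return Unit(
        x=mean([p.x for p in platforms]),
        y=mean([p.y for p in platforms]),
        n_platforms=len(platforms),
        strength=mean([p.strength for p in platforms]),
    )


def disaggregate(unit, spacing=50.0):
    """Place a unit's platforms line abreast, centered on the unit position.

    Individual differences lost during aggregation cannot be recovered, so
    every platform receives the unit's mean strength.
    """
    offset = (unit.n_platforms - 1) / 2.0
    return [
        Platform(unit.x + (i - offset) * spacing, unit.y, unit.strength)
        for i in range(unit.n_platforms)
    ]
```

Aggregating the platforms returned by disaggregate gives back the same unit, but the reverse is not true: a platoon whose tanks differed in strength or position comes back as identical, evenly spaced tanks. This loss of information is one reason repeated crossing between levels of resolution is difficult.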

This area has received a limited amount of study, yet it is a common need across
multiple types of simulations. None of the cognitive architectures examined in Pew and
Mavor (1998, Table 3.1) or here offers any insight. We can only note that several of the
architectures (e.g., COGNET, Soar) are designed to support multiple agents.

4.4  Summary 

Environments for interacting with existing modeling architectures are generally poorer
than those now provided for most programming languages. The requirements for modeling
are greater than those for general programming, including the need for adjustable accuracy, different
levels of analyses, and multiple measurements from running programs. These factors
contribute to making modeling difficult. We need new models and new techniques for
building and using models.

Recent Developments for Modeling

In  addition  to  the  architectures  and  approaches  identified  by  Pew  and  Mavor  (1998), 
there are a few other architectures that are worth examining. In this chapter we note them,
including  the  lessons  they  provide.  Our  reviews  also  explicitly  consider  ease  of  use 
(i.e.,  model  populating). 

We focus our comments on cognitive architectures because they have been created for
modeling  the  strengths  and  limitations  of  human  behavior.  Any  system  built  for  other 
reasons  that  was  adapted  in  this  way — for  example,  other  AI  systems — would  start  to 
approach  these  systems  in  capabilities  and  limitations.  It  is  quite  likely  that  the  cognitive 
architecture  that  best  matches  human  behavior  will  vary  by  the  type  of  behavior  and  level  of 
aggregation.  For  example,  different  architectures  will  be  preferred  for  modeling  a  soldier 
performing  simple  physical  tasks  than  for  a  deliberate  and  reflective  commander. 

There will continue to be a range of architectures created. We agree completely with
Pew and Mavor (pp. 110-111) that further work is necessary before settling on an
architecture. That is not to say that architectures will not continue to converge (e.g., Soar
and EPIC, Chong, 2001, and Soar and ACT-R, Jones, 1998). We start, however, by
examining  ways  to  summarize  data  and  some  advanced  AI  techniques  to  help  create  models. 
We  then  examine  several  architectures. 

5.1  Data  Gathering  and  Analysis  Techniques 

Scattered  throughout  Pew  and  Mavor  (e.g.,  pp.  323-325)  are  comments  about  the  need 
for  data  to  develop  and  test  models.  Data  to  develop  models  can  come  from  a  wide  variety 
of sources. Data can come from speaking to experts and having them do tasks off-line,
so-called knowledge acquisition (Chipman & Meyrowitz, 1993; Schraagen, Chipman, &
Shalin, 2000; Shadbolt & Burton, 1995). Data can also come from having experts talk aloud
while performing the task (Ericsson & Simon, 1993). Talking aloud is a more accurate way
to acquire the knowledge because it is based on actual behavior rather than someone’s
impression and memory of behavior. It is, however, a more costly approach because the
modeler must infer the behavior generators. Data for developing models can also come from
non-verbal measurements of experts while they perform the task. Non-verbal measurements
are probably the least useful data (but still useful in some circumstances) for developing
models. These data are useful, however, in testing models that make timing predictions.
Data can also come from previously run studies, reviews, and compendia of such studies
(e.g., Boff & Lincoln, 1988; Sekuler & Blake, 1994). A useful review of data types and
analysis  methods  in  this  area  is  provided  by  Hoffman  (1987). 

A  major  requirement  will  be  a  balance  between  the  experimental  control  of  the  lab  and 
the  richness  of  the  real  world.  An  appropriate  balance  can  sometimes  be  achieved  by 
gathering data in the same micro-world simulations in which the models will be deployed,
such as synthetic environments. These environments can be used to model all the salient
aspects  of  the  real  world,  while  still  providing  some  level  of  experimental  control. 

Once  the  data  are  in  hand,  they  will  often  have  to  be  aggregated  or  summarized.  Expert 
summaries  from  knowledge  acquisition  already  represent  summarized  data,  but  the  field  of 
verbal  protocol  analysis  has  developed  a  wide  range  of  techniques  for  summarizing  such 
data. 

Reviews  and  suggestions  in  this  area  are  available  (Ericsson  &  Simon,  1993;  Sanderson 
&  Fisher,  1 994),  but  there  exists  a  very  wide  range  of  techniques  that  vary  based  on  how 
advanced  the  theory  is,  the  purposes  of  the  research,  and  the  domain.  Survival  analysis  is 
one  example  of  an  advanced  technique  to  examine  protocol  data  for  temporal  patterns  for 
later inclusion and comparison against model behavior (Kuk, Arnold, & Ritter, 1999).

With  data  in  hand,  the  next  step  is  either  to  develop  a  model  or  to  test  an  existing  model. 
There  is  little  formal  methodology  about  how  to  create  models.  Some  textbooks  attempt  to 
teach  this  creative  task  either  directly  (vanSomeren,  Barnard,  &  Sandberg,  1994)  or  by 
example  (McClelland  &  Rumelhart,  1988;  Newell  &  Simon,  1972).  There  are  summaries  of 
the  testing  process  (Ritter  &  Larkin,  1994)  and  of  some  possible  tests  (Ritter,  1993a). 
Tenney and Spector (2001) and Ritter and Bibby (2001) provide particularly useful
example  sets  of  comparisons.  Repairing  a  model  based  on  the  results  of  the  tests  can  be  a 
task  requiring  a  lot  of  creativity. 

5.2 Advanced AI Approaches

There  are  some  existing  AI  tools  that  could  be  used  to  create,  augment,  or  optimize 
models  of  performance.  We  note  here  three  tools  with  which  we  are  particularly  familiar. 
These  include  approaches  for  creating  behaviors,  such  as  genetic  algorithms  and  traditional 
AI-planning programs.

5.2.1  Genetic  Algorithms 

Genetic  Algorithms  (GAs)  are  search  methods  that  can  be  used  in  domains  in  which  no 
heuristic  knowledge  is  available  and  an  objective  function  exhibits  high  levels  of 
incoherence  (Goldberg,  1989).  That  is  to  say,  a  small  change  to  the  solution  state  may  often 
result in large changes to the objective function or fitness measure. These algorithms are
expensive  in  machine  resources  and  exhibit  slow  (but  often  steady)  convergence  to  a 
solution.  They  might  be  used  as  a  search  strategy  of  last  resort  for  plan  formation. 

GAs  are  a  family  of  algorithms  loosely  based  on  Darwinian  evolution.  They  optimize 
functions  without  assuming  that  the  search  space  will  be  linear.  They  start  with  a 
population  of  templates  for  possible  solutions  (analogous  to  sets  of  chromosomes),  and 
evaluate  them  to  determine  how  well  they  perform  (fitness).  After  the  fitness  values  are 
computed,  a  new  population  is  created.  A  variety  of  methods  have  been  used  to  create  the 
next  generation,  but  in  each  case  the  underlying  principle  has  been  to  include  copies  of  the 
chromosomes  proportional  to  their  fitness,  and  at  each  generation  to  create  new 
combinations  by  combining  two  parents’  chromosomes.  The  cycles  of  evaluation  and 
creation are then repeated.

Heuristics can be used with GAs to seed the initial population in a non-random way or
to guide the crossover process in a way that changes the distribution of offspring. Using
heuristics results in a memetic algorithm (one that manipulates basic blocks of information
or memes). As has been common experience throughout the history of AI, this introduction
of  domain  knowledge  can  drastically  transform  the  performance  of  the  GA.  Such  algorithms 
have  been  found  to  exceed  the  performance  of  previous  approaches  in  a  number  of  domains 
(Burke,  Elliman,  &  Weare,  1995).  There  may  be  reason  for  using  GAs  as  a  search  strategy 
in  planning. 
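
The evaluate-and-recreate cycle described in this section can be sketched in a few lines. This is a minimal illustration rather than any particular published GA: the bit-string encoding, one-point crossover, the mutation step, and every name and default value (evolve, count_ones, seeds, mutation_rate) are assumptions introduced here. Fitness values are assumed to be non-negative so that they can serve as selection weights.

```python
import random


def evolve(fitness, n_bits=20, pop_size=30, generations=40,
           mutation_rate=0.02, seeds=(), rng_seed=0):
    """Evolve bit-string chromosomes toward higher (non-negative) fitness."""
    rng = random.Random(rng_seed)
    # Heuristically chosen chromosomes, if any, seed the population;
    # the remainder is filled in at random.
    population = [list(s) for s in seeds][:pop_size]
    while len(population) < pop_size:
        population.append([rng.randint(0, 1) for _ in range(n_bits)])
    best = max(population, key=fitness)
    for _ in range(generations):
        # Parents are drawn in proportion to fitness; the small constant
        # keeps the weights valid if every fitness happens to be zero.
        weights = [fitness(c) + 1e-9 for c in population]
        offspring = []
        while len(offspring) < pop_size:
            mother, father = rng.choices(population, weights=weights, k=2)
            cut = rng.randrange(1, n_bits)        # one-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation is an addition of this sketch, not part of the text.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            offspring.append(child)
        population = offspring
        best = max(population + [best], key=fitness)
    return best


def count_ones(chromosome):
    """Toy fitness function: the number of bits set to 1."""
    return sum(chromosome)


best = evolve(count_ones)
print(count_ones(best), "of", len(best), "bits set")
```

Passing heuristically constructed chromosomes through the hypothetical seeds argument corresponds to the non-random seeding of the initial population mentioned above.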

5.2.2  Tabu  Search 

Tabu  search,  as  developed  by  Glover  (Glover  &  Laguna,  1998),  is  a  general  purpose 
approach  remarkably  effective  for  difficult  problems  where  the  objective  function  has  some 
local  coherence.  It  is  surprising  how  often  hill-climbing  approaches  such  as  the  A* 
algorithm  are  used  in  current  plan-building  algorithms,  despite  the  domains  being  prone  to 
local  maxima.  Tabu  search  uses  the  novel  concept  of  recency  memory  to  prevent  moves  in  a 
solution  space  from  being  tried  when  some  component  of  that  state  has  recently  been 
changed  in  a  previous  move.  This  surprisingly  simple  idea  forces  the  search  away  from  a 
local  maximum.  Long-term  memory  is  used  to  hold  the  best  solution  state  found  so  far  and 
this  knowledge  may  be  used  to  restart  the  search  far  away  from  any  previous  exploration  of 
the  state  space. 
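
The recency memory and best-so-far memory can be illustrated with a small sketch. It assumes a state is a tuple of bits, a move flips one bit, and a flipped position stays tabu for a fixed number of moves; the names tabu_search, trap, and tenure are illustrative, and the restart step described above is omitted. The tenure must be smaller than the number of bits, or no move is ever allowed.

```python
from collections import deque


def tabu_search(score, start, iterations=50, tenure=3):
    """Maximize score over bit tuples using single-bit flips."""
    current = tuple(start)
    best, best_score = current, score(current)
    recently_flipped = deque(maxlen=tenure)       # recency memory
    for _ in range(iterations):
        candidates = []
        for i in range(len(current)):
            if i in recently_flipped:
                continue                          # this move is tabu
            neighbor = current[:i] + (1 - current[i],) + current[i + 1:]
            candidates.append((score(neighbor), i, neighbor))
        if not candidates:
            break
        # Take the best allowed move even when it is worse than the
        # current state; this is what pushes the search off a local maximum.
        value, flipped, current = max(candidates)
        recently_flipped.append(flipped)
        if value > best_score:                    # best-so-far memory
            best, best_score = current, value
    return best, best_score


def trap(bits):
    """Deceptive objective: more ones look better, but all zeros is best."""
    return len(bits) + 1 if sum(bits) == 0 else sum(bits)


print(tabu_search(trap, (1, 1, 1, 1)))
```

Starting from the all-ones state, every single flip lowers the score, so a plain hill climber stops immediately. The tabu list prevents the search from simply undoing its last move, and it walks downhill until it reaches the all-zeros state.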

The  Tabu  search  approach  would  almost  certainly  lead  to  improved  solutions  with 
reasonable  computational  complexity.  It  would  be  worth  using  this  approach  to  search  for 
strategies  and  plans  at  various  levels  in  a  synthetic  environment  from  the  individual 
combatant  to  the  highest  level  source  of  command  and  control. 

Soar  is  impressive  in  its  ability  to  reuse  parts  of  problems  that  have  been  solved  in  the 
past  and  to  plan  in  a  goal-directed  way  that  can  seem  ingenious.  Real  human  problem 
solving can be less structured, however, and can leap from one approach to another in a
manner that is difficult to model. Tabu search has this characteristic as part of its
diversification strategy. Including Tabu search in a cognitive architecture would be
interesting. There may be some advantages to be gained by grafting on other similar systems
that modify the beliefs of a cognitive architecture so as to maintain various types of logical
consistency  in  the  set  of  facts  held. 

5.2.3  Multiple  Criteria  Heuristic  Search^ 

Heuristic  search,  one  of  the  classic  techniques  in  AI,  has  been  applied  to  a  wide  range  of 
problem-solving tasks including puzzles, two-player games, and path-finding problems. A
key assumption of all problem-solving approaches based on utility theory, including
heuristic search, is that we can assign a single utility or cost to each state. This, in turn,
requires that all criteria of interest can be reduced to a common ratio scale.

The route-planning problem has conventionally been formulated as one of finding a
minimum-cost  (or  low-cost)  route  between  two  locations  in  a  digitized  map,  where  the  cost 
of  a  route  is  an  indication  of  its  quality  (e.g.,  Campbell,  Hull,  Root,  &  Jackson,  1995).  In 
this  approach,  planning  is  regarded  as  a  search  problem  in  a  space  of  partial  plans,  allowing 
many  of  the  classic  search  algorithms  such  as  A*  (Hart,  Nilsson,  &  Raphael,  1968)  or 
variants  such  as  A*epsilon  (Pearl,  1982)  to  be  applied.  However,  while  such  planners  are 
complete and optimal (or optimal to some bound ε), formulating the route-planning task in
terms  of  minimizing  a  single  criterion  is  difficult. 

For  example,  consider  the  problem  of  planning  a  route  in  a  complex  terrain  of  hills, 
valleys,  impassable  areas,  and  so  on.  A  number  of  factors  will  be  important  in  evaluating  the 
quality  of  a  plan:  the  length  of  the  route,  the  maximum  negotiable  gradient,  the  degree  of 
visibility,  and  so  on.  In  any  particular  problem,  some  of  these  criteria  will  affect  the 
feasibility  of  the  route,  while  others  are  simply  preferences.  Route  planning  is  an  example  of 
a  wide  class  of  multi-criteria,  problem-solving  tasks,  where  different  criteria  must  be  traded 
off  to  obtain  an  acceptable  solution. 

One  way  of  incorporating  multiple  criteria  into  the  problem-solving  process  is  to  define 
a cost function for each criterion and use, for example, a weighted sum of these functions as
the  function  to  be  minimized.  We  can,  for  example,  define  a  visibility  cost  for  being  exposed 
and  combine  this  cost  with  cost  functions  for  the  time  and  energy  required  to  execute  the 
plan  to  form  a  composite  function  that  can  be  used  to  evaluate  alternative  plans.  However, 
the  relationship  between  the  weights  and  the  solutions  produced  is  complex  in  reality,  and  it 
is  often  unclear  how  the  different  cost  functions  should  be  combined  linearly  as  a  weighted 
sum  to  give  the  desired  behavior  across  all  magnitude  ranges  for  the  costs.  This  makes  it 
hard  to  specify  what  kinds  of  solutions  a  problem-solver  should  produce  and  hard  to  predict 
what  a  problem  solver  will  do  in  any  given  situation;  small  changes  in  the  weight  of  one 
criterion  can  result  in  large  changes  in  the  resulting  solutions.  Changing  the  cost  function  on 
a single criterion to improve the behavior related to that criterion often leads to changing all
the  weights  for  all  the  other  costs  as  well  because  the  costs  are  not  independent.  Moreover, 
if  different  criteria  are  more  or  less  important  in  different  situations,  we  need  to  find  sets  of 
weights  for  each  situation. 
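
The sensitivity to weights can be seen in a toy case. The two routes, their cost figures, and the function names below are invented for illustration only; the point is that moving one weight from 1.1 to 1.3 changes which route the weighted sum prefers.

```python
# Hypothetical per-criterion costs for two candidate routes.
routes = {
    "valley road": {"time": 10.0, "exposure": 6.0},
    "ridge path": {"time": 16.0, "exposure": 1.0},
}


def weighted_cost(route, weights):
    """Combine the per-criterion costs linearly as a weighted sum."""
    return sum(weights[name] * route[name] for name in weights)


def best_route(weights):
    """Return the name of the route with the lowest weighted cost."""
    return min(routes, key=lambda name: weighted_cost(routes[name], weights))


for exposure_weight in (1.1, 1.3):
    weights = {"time": 1.0, "exposure": exposure_weight}
    print(exposure_weight, best_route(weights))
```

With a time weight of 1, the two routes cost the same when the exposure weight is 1.2, so any choice of weight near that value decides the plan.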

The  desirability  of  trade-offs  between  criteria  is  context-dependent.  In  general,  the 
properties that determine the quality of a solution are incommensurable. For example, the
criteria  may  only  be  ordered  (on  an  ordinal  scale),  with  those  criteria  that  determine  the 
feasibility  of  a  solution  being  greatly  preferred  to  those  properties  that  are  merely  desirable. 
It  is  difficult  to  see  how  to  convert  such  problems  into  a  multi-criterion  optimization 
problem  without  making  ad  hoc  assumptions.  It  is  also  far  from  clear  that  human  behavior 
solely  optimizes  on  a  single  criterion. 

Rather  than  attempt  to  design  a  weighted-sum  cost  function,  it  is  often  more  natural  to 
formulate  such  problems  in  terms  of  a  set  of  constraints  that  a  solution  should  satisfy.  We 
allow  constraints  to  be  prioritized,  that  is,  it  is  more  important  to  satisfy  some  constraints 
than others, and soft, that is, constraints are not absolute and can be satisfied to a greater or
lesser  degree.  Such  a  framework  is  more  general  in  admitting  both  optimization  problems 
(e.g.,  minimization  constraints)  and  satisficing  problems  (e.g.,  upper-bound  constraints), 
which cannot be modeled by simply minimizing weighted-sum cost functions. Vicente
(1998) suggests ways in which such constraints can be analyzed as part of a work
domain  analysis. 

This  approach  to  working  with  constraints  provides  a  way  for  more  clearly  specifying 
problem-solving  tasks  and  more  precisely  evaluating  the  resulting  solutions.  There  is  a 
straightforward  correspondence  between  the  real  problem  and  the  constraints  passed  to  the 
problem-solver.  A  solution  can  be  characterized  as  satisfying  some  constraints  (to  a  greater 
or  lesser  degree)  and  only  partially  satisfying  or  not  satisfying  others.  By  annotating 
solutions  with  the  constraints  they  satisfy,  the  implications  of  adopting  or  executing  the 
current  best  solution  are  immediately  apparent.  The  annotations  also  facilitate  the 
integration  of  the  problem-solver  into  the  architecture  of  an  agent  or  a  decision-support 
system  (see  for  example,  Logan  &  Sloman,  1998).  If  a  satisfactory  solution  cannot  be  found, 
the  degree  to  which  the  various  constraints  are  satisfied  or  violated  by  the  best  solution 
found  so  far  can  be  used  to  decide  whether  to  change  the  order  of  the  constraints,  relax  one 
or  more  constraints,  or  even  redefine  the  goal,  before  making  another  attempt  to  solve 
the  problem. 
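
One simple way to make prioritized soft constraints concrete is sketched below. It assumes that every constraint is an upper bound, that the degree of violation is the amount by which the bound is exceeded, and that candidate solutions are compared lexicographically with the most important constraint first. These are simplifying assumptions of the sketch (strict lexicographic ordering in particular is stricter than the trading of slack described in this section), and the route data and function names are invented.

```python
def violations(solution, constraints):
    """Amount by which each constraint is exceeded, most important first."""
    return tuple(max(0.0, measure(solution) - limit)
                 for _name, measure, limit in constraints)


def best_solution(solutions, constraints):
    """Prefer the solution whose violations are lexicographically smallest."""
    return min(solutions,
               key=lambda name: violations(solutions[name], constraints))


def annotate(solution, constraints):
    """Label a solution with the constraints it satisfies or violates."""
    amounts = violations(solution, constraints)
    return {name: "satisfied" if amount == 0 else f"violated by {amount:g}"
            for (name, _measure, _limit), amount in zip(constraints, amounts)}


# Hypothetical constraints, listed in priority order.
constraints = [
    ("gradient at most 15%", lambda r: r["max_gradient"], 15.0),
    ("exposure at most 2 km", lambda r: r["exposed_km"], 2.0),
    ("length at most 12 km", lambda r: r["length_km"], 12.0),
]

# Hypothetical candidate routes.
routes = {
    "A": {"max_gradient": 12.0, "exposed_km": 3.5, "length_km": 9.0},
    "B": {"max_gradient": 14.0, "exposed_km": 1.5, "length_km": 14.0},
    "C": {"max_gradient": 20.0, "exposed_km": 0.5, "length_km": 8.0},
}

choice = best_solution(routes, constraints)
print(choice, annotate(routes[choice], constraints))
```

The annotations make the consequences of adopting a solution explicit, and reordering the list of constraints is enough to express a different set of priorities.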

The  ordering  of  constraints  blurs  the  conventional  distinction  between  absolute 
constraints  and  preference  constraints.  All  constraints  are  preferences  that  the  problem- 
solver  will  try  to  satisfy,  trading  off  slack  on  a  more  important  constraint  to  satisfy  another, 
less  important  constraint. 

The A* search algorithm is ill-suited to dealing with problems formulated in terms of
constraints.  Researchers  at  Birmingham  have  therefore  developed  a  generalization  of  A* 
called  A*  with  Bounded  Costs  (ABC;  Alechina  &  Logan,  1998;  Logan  &  Alechina,  1998), 
which  searches  for  a  solution  that  best  satisfies  a  set  of  prioritized  soft  constraints. 

The  utility  of  this  approach  and  the  feasibility  of  the  ABC  algorithm  have  been 
illustrated  by  an  implemented  route  planner  that  is  capable  of  planning  routes  in  complex 
terrain  satisfying  a  variety  of  constraints.  This  work  was  originally  motivated  by  difficulties 
in  applying  classical  search  techniques  to  agent-route  planning  problems.  However,  the 
problems  identified  with  utility-based  approaches,  and  the  proposed  solutions,  are  equally 
applicable  to  other  search  problems. 

5.3  Psychologically  Inspired  Architectures 

We  review  here  several  psychologically  inspired  cognitive  architectures  that  were  not 
covered by Pew and Mavor (1998). These architectures are interesting because (1) they are
psychologically  plausible,  (2)  some  of  them  provide  examples  of  how  emotions  and 
behavioral  moderators  can  be  included,  and  (3)  several  illustrate  that  better  interfaces  for 
creating  cognitive  models  are  possible. 

5.3.1  Elementary  Perceiver  and  Memoriser 

The Elementary Perceiver And Memoriser (EPAM) is a well-known computer model of
a  wide  and  growing  range  of  memory  tasks.  The  basic  ideas  behind  EPAM  include 
mechanisms  for  encoding  chunks  of  information  into  long-term  memory  by  constructing  a 
discrimination  network.  The  EPAM  model  has  been  used  to  simulate  a  variety  of 
psychological  regularities,  including  the  learning  of  verbal  material  (Feigenbaum  &  Simon, 
1962, 1984) and expert digit-span memory (Richman, Staszewski, & Simon, 1995). EPAM
has  been  expanded  to  use  visuo-spatial  information  (Simon  &  Gilmartin,  1973). 

EPAM organizes memory into a collection of chunks, where each chunk is a meaningful
group of basic elements. For example, in chess, the basic elements are the pieces and their
locations; the chunks are collections of pieces, such as a king-side pawn formation. These
chunks are developed through the processes of discrimination and familiarization.
Essentially,  each  node  of  the  network  holds  a  chunk  of  information  about  an  object  in  the 
world.  The  nodes  are  interconnected  by  links  into  a  network  with  each  link  representing  the 
result  of  applying  a  test  to  the  object.  When  trying  to  recognize  an  object,  the  tests  are 
applied  beginning  from  the  root  node,  and  the  links  are  followed  until  no  further  test  can  be 
applied.  When  a  node  is  reached,  if  the  stored  chunk  matches  that  of  the  object  then 
familiarization  occurs.  The  chunk’s  resolution  is  then  increased  by  adding  more  details  of 
the  features  in  that  object.  If  the  current  object  and  the  chunk  at  the  node  reached  differ  in 
some  feature,  then  discrimination  occurs,  which  adds  a  new  node  and  a  new  link  based  on 
the  mismatched  feature.  Therefore,  with  discrimination,  new  nodes  are  added  to  the 
discrimination  network;  with  familiarization,  the  resolution  of  chunks  at  those  nodes 
is  increased. 
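
The interplay of discrimination and familiarization can be sketched as follows. This is a toy reading of the description above, not the EPAM implementation: objects are assumed to be dictionaries of features, each familiarization adds one feature to a chunk, and the names Node, sort_to_node, and learn are invented for the sketch.

```python
class Node:
    """One node of the discrimination network."""

    def __init__(self):
        self.image = {}      # the chunk: feature -> value known so far
        self.test = None     # feature tested at this node (None if untested)
        self.children = {}   # test outcome -> child node


def sort_to_node(root, obj):
    """Apply tests from the root until no further test can be applied."""
    node = root
    while node.test is not None and obj.get(node.test) in node.children:
        node = node.children[obj.get(node.test)]
    return node


def learn(root, obj):
    """Present one object and report which learning process occurred."""
    node = sort_to_node(root, obj)
    mismatched = [f for f, v in node.image.items() if obj.get(f) != v]
    if mismatched:
        # Discrimination: add a new node and a link for the mismatch.
        if node.test is None:
            node.test = mismatched[0]
        node.children[obj.get(node.test)] = Node()
        return "discrimination"
    unknown = [f for f in obj if f not in node.image]
    if unknown:
        # Familiarization: increase the resolution of the stored chunk.
        node.image[unknown[0]] = obj[unknown[0]]
        return "familiarization"
    return "recognized"


root = Node()
pawn = {"piece": "pawn", "square": "e4"}
king = {"piece": "king", "square": "g1"}
for item in (pawn, pawn, pawn, king, king):
    print(item["piece"], learn(root, item))
```

Repeated presentations of the same object refine its chunk until it is recognized, while a differing object grows the network by one node.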

The Chunk Hierarchy and REtrieval STructures (CHREST; de Groot & Gobet, 1996;
Gobet  &  Simon,  1996b)  is  one  of  the  most  current  theories  of  memory  developed  from  the 
ideas  in  EPAM.  Gobet  and  Simon  (2000)  present  a  detailed  description  of  the  present 
version  of  CHREST  and  report  simulations  on  the  role  of  presentation  time  in  the  recall  of 
game  and  random  chess  positions.  As  in  the  earlier  chunking  theory  of  Chase  and  Simon 
(1973), CHREST assumes that chess experts develop a large EPAM-like net of chunks
during  their  practice  and  study  of  the  game.  In  addition,  CHREST  assumes  that  some 
chunks,  which  recur  often  during  learning,  develop  into  more  complex  retrieval  structures 
(templates)  with  slots  for  variables  that  allow  a  rapid  encoding  of  chunks  or  pieces. 

EPAM  and  its  implementations  are  important  to  consider  because  they  fit  a  subset  of 
regularities  in  memory  very  well.  This  at  least  serves  as  an  example  for  other  theories  and 
architectures  to  emulate.  It  may  also  be  possible  to  include  the  essentials  of  EPAM  in 
another  system,  such  as  Soar  or  ACT-R,  extending  the  scope  of  both  approaches. 

5.3.2  Neural  Networks 

Pew  and  Mavor  (1998,  chap.  3)  review  neural  networks.  Here,  therefore,  we  only 
provide  some  further  commentary,  introduce  some  more  advanced  concepts,  and  note  a  few 
further  applications. 

Connectionist  systems  have  demonstrated  the  ability  to  learn  arbitrary  mappings. 
Architectures such as the Multi-Layer Perceptron (MLP) are capable of being used as a
black  box  that  can  learn  to  recognize  a  pattern  of  inputs  as  a  particular  situation.  This 
requires  supervised  training  and  may  involve  heavy  computational  resources  to  arrive  at  a 
successful solution using the back-propagation algorithm. Training can be continued during
performance  as  a  background  task,  and  thus,  an  entity  could  have  an  ability  to  learn  during 
action  based  on  this  approach.  Recognition  performance  is  relatively  rapid  and  a  multi-layer 
perceptron  might  be  used  to  model  a  reaction  mechanism  in  which  a  combatant  responds  to 
coming under fire, or spotting the presence of the enemy, for example. It might also be used
to  activate  particular  aspects  of  military  doctrine  depending  on  the  current  circumstances. 

Recurrent  nets  such  as  the  Elman  (1991)  net  have  the  ability  to  generate  sequences  of 
tokens  as  output.  These  seem  to  offer  some  promise  of  detecting  an  input  situation  and 
producing a series of behavioral actions as a response. This behavior of recurrent nets may
be  useful  for  modeling  the  reactive  behavior  of  an  entity  over  a  short  time  period,  while  a 
symbolic  cognitive  model  is  used  for  the  higher-level  cognitive  processes  that  occur  over  a 
longer  time  span. 

5.3.3  Sparse  Distributed  Memories 

Subtle issues such as the tip-of-the-tongue phenomenon (Koriat & Lieblich, 1974) and the
fact that we know if we know something (feeling of knowing) before becoming aware of the
answer are not often modeled (although, see Schunn, Reder, Nhouyvanisvong, Richards, &
Stroffolino, 1997, for a counterexample). These effects may be captured using memory
models such as Kanerva’s (1988) Sparse Distributed Memory (SDM), which has been put
forward as a plausible model of brain architecture, particularly the cerebellum, or
Albus’s (1971) Cerebellar Model Arithmetic Computer (CMAC).

The  way  in  which  a  combatant’s  experience  of  the  world  is  stored  and  modeled  is 
important.  An  SDM  seems  to  offer  powerful  human-like  ways  of  recalling  nearest  matches 
to  present  experience  in  a  best-first  manner.  They  have  the  interesting  property  of  storing 
memories  such  that  recall  works  by  finding  the  best  match  to  imperfect  data.  They  are  also  a 
natural  way  of  storing  sequences.  They  exploit  interesting  mathematical  properties  of  binary 
metric  spaces  with  a  large  number  of  dimensions.  It  is  intriguing  that  SDMs  have  the 
human-like properties that they “know if they know” something before the retrieval process
is  complete.  They  also  exhibit  the  tip-of-the-tongue  phenomenon  and  replicate  the  human 
ability  to  recall  a  sequence  or  tune  given  the  first  few  items  or  notes.  They  can  also  learn 
actuator sequences that might be used in muscle control or reflex patterns of behavior. This
can  even  be  seen  as  a  kind  of  thinking  by  analogy  that  has  a  uniquely  human-like  ability  to 
find  a  close  match  rapidly  without  exhaustive  or  even  significant  time  spent  in  search. 
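
Recall of the best match to imperfect data can be illustrated with a small sketch in the spirit of Kanerva's design. The word length, number of storage locations, activation radius, and all names below are arbitrary choices made for this sketch; a realistic SDM uses much longer words and many more locations, and the radius would need to be rescaled if the word length changed.

```python
import random


class SparseDistributedMemory:
    """Toy sparse distributed memory over binary words."""

    def __init__(self, n_bits=64, n_locations=500, radius=28, seed=0):
        rng = random.Random(seed)
        self.n_bits = n_bits
        self.radius = radius
        # Each storage location has a fixed random address and one
        # counter per bit of the stored word.
        self.addresses = [[rng.randint(0, 1) for _ in range(n_bits)]
                          for _ in range(n_locations)]
        self.counters = [[0] * n_bits for _ in range(n_locations)]

    def _active(self, address):
        """Indices of locations within the Hamming radius of address."""
        return [i for i, hard in enumerate(self.addresses)
                if sum(a != b for a, b in zip(hard, address)) <= self.radius]

    def write(self, address, data):
        """Add the word to the counters of every active location."""
        for i in self._active(address):
            for j, bit in enumerate(data):
                self.counters[i][j] += 1 if bit else -1

    def read(self, address):
        """Pool the counters of the active locations and threshold them."""
        active = self._active(address)
        sums = [sum(self.counters[i][j] for i in active)
                for j in range(self.n_bits)]
        return [1 if total > 0 else 0 for total in sums]


rng = random.Random(42)
memory = SparseDistributedMemory()
stored = [[rng.randint(0, 1) for _ in range(64)] for _ in range(3)]
for word in stored:
    memory.write(word, word)              # store each word at its own address
noisy = [1 - bit for bit in stored[0][:5]] + stored[0][5:]
recalled = memory.read(noisy)
print(sum(a != b for a, b in zip(recalled, stored[0])), "bits differ")
```

Because a word is spread over every location near its address, a cue that is only close to the original still activates many of the same locations, which is the source of the best-match behavior described above.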

5.3.4  PSI  and  Architectures  That  Include  Emotions 

PSI is a relatively new cognitive architecture designed to integrate cognitive processes,
emotions, and motivation (Bartl & Dörner, 1998). The architecture includes six motives
(needs for energy, water, pain avoidance, affiliation, certainty, and competence). Cognition
is modulated by these motive/emotional states and their processes. In general, PSI organizes
its  activities  similar  to  Rasmussen’s  (1983)  hierarchy:  first,  it  tries  highly  automatic  skills  if 
possible,  then  it  skips  to  knowledge-based  behavior,  and  as  its  ultima  ratio  approach  it  uses 
trial-and-error  procedures.  It  is  one  of  the  only  cognitive  architectures  that  we  know  about 
that  takes  modeling  emotion  and  motivation  as  one  of  its  core  tasks.  Its  source  code,  in 
Delphi Pascal, is available (www.uni-bamberg.de/ppp/insttheopsy/psi-software.html).
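
The order in which PSI is described as trying its resources (automatic skills, then knowledge-based behavior, then trial and error) can be sketched as a simple fallback chain. This is only an illustration of that ordering and has no connection to the PSI source code; the function, the data structures, and the example situations are all invented.

```python
import random


def select_behavior(situation, skills, rules, actions, rng):
    """Return (action, level) using the first level that applies."""
    if situation in skills:                    # highly automatic skill
        return skills[situation], "skill"
    for applies, action in rules:              # knowledge-based behavior
        if applies(situation):
            return action, "knowledge"
    return rng.choice(actions), "trial and error"   # last resort


# Hypothetical contents of each level.
skills = {"thirsty at well": "drink"}
rules = [(lambda s: s.startswith("thirsty"), "search for water")]
actions = ["go north", "go south", "wait"]

rng = random.Random(0)
for situation in ("thirsty at well", "thirsty in desert", "lost"):
    print(situation, select_behavior(situation, skills, rules, actions, rng))
```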

A  model  in  the  PSI  architecture  has  been  tested  against  a  set  of  data  taken  from  a 
dynamic  control  task.  The  model  performed  the  same  task  and  its  number  of  control  actions 
was  within  the  range  of  human  behavior.  Its  predictions  of  summary  scores  were  outside  the 
range of human behavior, with the model being less competent (Detje, 2000), but single
subjects can be modeled by varying starting parameters (Dörner, 2000). In such a complex
task as the “Island” scenario, some people will use meta-cognition to improve their
performance (particularly if they are encouraged to think aloud as they were in Detje’s
study). The same data could reveal that these subjects profit from meta-cognition and that
their behavior then differs from what is implemented currently in PSI (see Bartl, 2000, for
a  more  detailed  explanation). 

This  model  needs  to  be  improved  before  it  matches  human  emotional  data  as  well  as 
other cognitive models match non-emotional data. It is, however, one of the few models of
emotion  compared  with  data. 

The  PSI  architecture  is  currently  incomplete,  which  raises  interesting  questions  about 
how  to  judge  a  nascent  architecture.  PSI  does  not  have  a  large  enough  user  community  and 
has not been developed long enough to have a body of regularities to be compared with, let
alone  adjusted  to  fit.  How  can  PSI  be  compared  with  the  older  architectures  with  existing 
tutorials,  user  manuals,  libraries  of  models,  and  example  applications? 

Several  other  models  of  emotions  and  architectures  that  use  emotions  have  been  created. 
Reviews  of  emotional  models  (Hudlicka  &  Fellous,  1996;  Picard,  1997)  typically  present 
models  and  architectures  that  have  not  been  compared  and  validated  against  human  data. 
There  appears  to  be  one  other  exception,  an  unpublished  PhD  thesis  by  Araujo  at  the 
University  of  Sussex  (cited  in  Picard,  1997).  Some  of  us  are  attempting  to  add  several 
simple emotions to ACT-R (Belavkin, 2001; Belavkin et al., 1999) and validate the model
by  comparing  the  revised  model  with  an  existing  model  and  comparable  data  (G.  Jones, 
Ritter,  &  Wood,  2000). 

5.3.5  COGENT 

COGENT  is  a  design  environment  for  creating  cognitive  models  and  architectures 
(Cooper  &  Fox,  1998).  It  allows  the  user  to  draw  box-and-arrow  diagrams  to  structure  and 
illustrate  the  high-level  organization  of  the  model  and  to  fill  in  the  details  of  each  box  using 
one  or  a  series  of  dialogue  sheets.  The  boxes  include  inputs,  outputs,  memory  buffers, 
processing  steps,  and  even  production  systems  as  components. 

COGENT’s strengths are that it is easy to teach, the displays provide useful summaries
of  the  model  that  help  with  explanation  and  development,  and  the  environment  is  fairly 
complete.  It  appears  possible  to  reuse  components  on  the  level  of  boxes.  COGENT’s 
weaknesses  are  that  it  is  fairly  unconstrained;  for  large  systems  it  may  be  unwieldy;  and  it 
might  not  interface  well  to  external  simulations. 

COGENT also shows that cognitive modeling environments can at least appear more
friendly.  The  results  of  its  graphic  interface  routinely  appear  in  talks  as  model  summaries. 
The  interface  is  also  quite  encouraging  to  users,  allowing  them  to  feel  that  they  can  start 
working immediately.

5.3.6 Hybrid Architectures

Hybrid architectures are architectures that typically include symbolic and non-symbolic
elements.  A  more  general  definition  would  be  architectures  that  include  major  components 
from  multiple  architectures. 

Hybrid  architectures  are  mentioned  briefly  by  Pew  and  Mavor  (1998,  pp.  108-110). 
Work has continued in this area with some interesting results. LICAI (Kitajima & Polson,
1996; Kitajima, Soto, & Polson, 1998), for example, models how people explore and use
interfaces based on a theory of how Kintsch’s (1998) schemas receive activation. The U.S.
Office of Naval Research (ONR) has sponsored a research program on hybrid architectures
(Gigley & Chipman, 1999). This has given rise to some new, interesting hybrid architectures
(e.g., Sun, Merrill, & Peterson, 1998; Wang, Johnson, & Zhang, 1998).

Perhaps  the  most  promising  hybrids  are  melding  perception  components  across 
cognitive architectures. The EPIC (Kieras & Meyer, 1997) architecture’s perception and
action component has been merged with ACT-R’s perceptual-motor component,
ACT-R/PM (Byrne, 2001; Byrne & Anderson, 1998), and with Soar (Chong, 2001). This has led
to  direct  reuse  and  unification.  Similar  results  have  been  found  with  the  Nottingham 
functional  interaction  architecture  being  used  by  Soar  and  ACT-R  models  (Bass  et  al.,  1995; 
Baxter  &  Ritter,  1996;  Ritter  et  al.,  2000;  G.  Jones  et  al.,  2000). 

5.4  Knowledge-Based  Systems  and  Agent  Architectures 

Agent  architectures  will  be  important  within  synthetic  environments  for  modeling 
autonomous  vehicles  and  for  exploring  the  doctrine  of  autonomous  vehicles.  Most 
principled  agent  architectures  have  historical  roots  in  distributed  artificial  intelligence.  For 
several decades, distributed AI has been tackling essentially the same problem as
Knowledge-Based Systems (KBS) research, namely, how to produce efficient
problem-solving behavior in software. The main concept that brings agency and KBS together is the
idea of operation at the knowledge level as described by Newell (1982).

The  behavioral  law  used  by  an  observer  to  understand  the  agent  at  the  knowledge  level 
is  the  principle  of  maximum  rationality  (Newell,  1982),  which  states,  “If  an  agent  has 
knowledge  that  one  of  its  actions  will  lead  to  one  of  its  goals,  then  the  agent  will  select  that 
action.”  The  modeling  of  intelligent  artificial  systems  at  the  knowledge  level,  that  is,  with 
no  reference  to  details  of  implementation,  is  a  key  principle  in  KBS  construction.  It  is  also  at 
the  heart  of  many  assumptions  in  the  tradition  of  explaining  human  behavior. 

Nwana  (1996)  claims  that  an  important  difference  between  agent-based  applications  and 
other  distributed  computing  applications  is  that  agent-based  applications  operate  typically  at 
the  knowledge  level,  whereas  distributed  computing  applications  operate  at  the  symbol 
level.  At  the  symbol  level,  the  entity  is  seen  simply  as  a  mechanism  acting  over  symbols, 
and  its  behavior  is  described  in  these  terms. 

The  theoretical  links  between  the  motivations  behind  KBS  and  agent  research  can  be 
seen  in  the  main  approaches  taken  to  the  definition  of  software  agency.  Ascriptional  agency 
attempts  to  create  convincing  human-like  behaviors  in  software  in  the  belief  that  this  will 
produce  programs  that  are  easy  to  interact  with.  This  work  can  be  seen  as  paralleling  the 
expert  behavioral  modeling  approach  that  is  currently  widely  espoused  in  the  KBS 
community. The Belief-Desire-Intention (BDI) agents focus on the concept of
intentionality — the  mental  attitudes  of  the  agent.  BDI  models  have  been  successfully 
implemented  in  systems  such  as  the  DESIRE  framework  (Brazier,  Dunin-Keplicz,  Treur,  & 
Verbrugge,  1999)  and  the  JAVA  Agent  Compiler  and  Kernel  (JACK)  component  system 
(Busetta,  Howden,  Ronnquist,  &  Hodgson,  1999a;  Busetta,  Ronnquist,  Hodgson,  & 
Lucas,  1999b). 

JACK  is  an  extension  to  JAVA.  It  includes  a  JAVA  library  and  a  compiler  that  takes  a 
JAVA  program  with  embedded  JACK  statements.  A  JAVA  compiler  expands/incorporates 
the  JACK  statements  to  create  a  runnable  JAVA  program.  These  statements  implement  a 
BDI  architecture,  while  allowing  JAVA  statements  to  extend  and  implement  them.  The 
statements include commands like @achieve(condition, event), which subgoals on event if
condition  is  not  found  to  be  true. 

The  resulting  program  instantiates  a  BDI  agent.  Its  BDI  architecture  is  made  up  of 
beliefs  represented  with  a  database;  desires  represented  as  events  that  can  trigger  plans;  and 
intentions  represented  through  these  plans.  For  example,  a  fact  may  come  in  from 
perception  and  match  a  desire,  that  of  putting  new  facts  into  the  database.  This  may  result  in 
further  desires  being  matched  and  intentions  (plans)  leading  to  behaviors.  Further 
information  is  available  at  the  JACK  developer’s  website  (www.agent-software.com.au). 
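
The flow from perception through desires to intentions can be sketched in plain Python. This is not JACK syntax or the JACK API: the Agent class, its method names, and the example facts are invented, and the achieve method is only a loose analogue of the @achieve statement described above.

```python
class Agent:
    """Toy BDI-style agent: a belief database plus event-triggered plans."""

    def __init__(self):
        self.beliefs = set()   # beliefs: a small database of facts
        self.plans = {}        # event type -> plan (a Python function)
        self.events = []       # pending events, standing in for desires
        self.log = []          # actions the agent has carried out

    def on(self, event_type, plan):
        """Register the plan that handles an event type."""
        self.plans[event_type] = plan

    def post(self, event_type, payload=None):
        """Queue an event for later handling."""
        self.events.append((event_type, payload))

    def achieve(self, condition, event_type, payload=None):
        """Subgoal on the event only if the condition is not yet believed."""
        if condition not in self.beliefs:
            self.post(event_type, payload)

    def run(self):
        """Handle events until none are pending."""
        while self.events:
            event_type, payload = self.events.pop(0)
            plan = self.plans.get(event_type)
            if plan is not None:
                plan(self, payload)   # the adopted plan acts as the intention


def store_fact(agent, fact):
    """Plan for percepts: record the fact, then react to it if needed."""
    agent.beliefs.add(fact)
    if fact == "under fire":
        agent.achieve("in cover", "take cover")


def take_cover(agent, _payload):
    """Plan for the 'take cover' event."""
    agent.log.append("move to cover")
    agent.beliefs.add("in cover")


agent = Agent()
agent.on("percept", store_fact)
agent.on("take cover", take_cover)
agent.post("percept", "under fire")
agent.run()
print(sorted(agent.beliefs), agent.log)
```

A second "under fire" percept triggers no further action here, because the condition checked by achieve is already believed.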

Reviews  of  the  agent  literature  (Etzioni  &  Weld,  1995;  Franklin  &  Graesser,  1997; 
Wooldridge  &  Jennings,  1995)^  reveal  that,  when  attempting  to  define  agency  as  dependent 
on  the  possession  of  a  set  of  cardinal  attributes,  many  of  the  attributes  suggested  could  also 
be  seen  as  characteristic  of  behavior  that  is  best  explained  at  the  knowledge  level.  These 
include  abstraction  and  delegation,  flexibility  and  opportunism,  task  orientation,  adaptivity, 
reactivity, autonomy, goal-directedness, collaborative and self-starting behavior,
temporal  continuity,  knowledge-level  communication  ability,  social  ability,  and  cooperation. 

Both  agent  systems  and  KBSs  are  moving  in  the  direction  of  modular  components  of 
expertise  as  a  response  to  the  problems  of  knowledge  use  and  reuse  to  promote  intelligent 
behavior  in  software.  Domain  ontologies  form  a  significant  subset  of  these  KBS 
components.  Increasingly,  multi-agent  systems  are  being  produced  that  use  such  domain 
ontologies  to  facilitate  agent  communication  at  the  knowledge  level,  for  example,  the  agent 
network created as part of the Infosleuth architecture (Jacobs & Shea, 1996). Some agent
systems  also  draw  explicitly  on  models  of  problem-solving  expert  behavior  developed  in 
KBS  research.  The  internet-based  Multi-agent  Problem  Solving  (IMPS)  architecture  (Crow 
&  Shadbolt,  1998)  uses  software  agency  as  a  medium  for  applying  model-driven  knowledge 
engineering  techniques  to  the  internet.  It  involves  software  agents  that  can  conduct 
structured online knowledge acquisition using distributed knowledge sources.
Agent-generated domain ontologies are used to guide a flexible system of autonomous agents
driven  by  problem-solving  models. 


For online information about examples and related U.S. programs, see www.darpa.mil/ito/ResearchAreas.html and
www.nosc.mil/robots/air/amgsss/mssmp.html.

5.5  Architectural Ideas Behind the Sim_Agent Toolkit

Since the early 1970s, Sloman and his colleagues have been attempting to develop
requirements  and  designs  for  an  architecture  capable  of  explaining  a  wide  variety  of  facts 
about  human  beings  and  other  intelligent  agents.  Sloman’s  ideas  about  cognitive 
architectures  and  the  theoretically  based  agent  architecture  toolkit  (Sim_Agent)  provide 
useful  lessons  about  architectural  toolkits  and  about  process  models  of  emotions.  Further 
information is available at the CogAff website (www.cs.bham.ac.uk/~axs/cogaff.html).

5.5.1  Cognition  and  Affect 

A human-like information processing architecture includes many components
performing different functions, all of which operate in parallel, asynchronously. This is not
the kind of low-level parallelism found in neural nets (although such neural mechanisms are
part of the infrastructure). Rather, there seem to be many functionally distinct modules
performing different sorts of tasks concurrently; a significant proportion of them are
concerned with the monitoring and control of bodily mechanisms, for example, posture,
saccades, grasping, temperature control, daily rhythms, and so on.

The very oldest mechanisms in the human architecture are probably all reactive in the
sense described in various recent papers (e.g., Sloman, 2000). The key feature of reactivity
is the lack of “what-if” reasoning capabilities, with all that entails, including the lack of
temporary  workspaces  for  representations  of  hypothesized  futures  (or  past  episodes);  the 
lack  of  mechanisms  for  stored  factual  knowledge  (generalizations  and  facts  about 
individuals)  to  support  the  generation  of  possible  futures,  possible  actions,  and  likely 
consequences  of  possible  actions;  and  the  lack  of  mechanisms  for  manipulating 
explicit  representations. 

Both  reactive  and  deliberative  mechanisms  require  perceptual  input  and  can  generate 
motor  signals.  However,  to  function  effectively,  both  perceptual  and  action  subsystems  may 
have  evolved  new  layers  of  abstraction  to  support  the  newer  deliberative  processes,  for 
example,  by  categorizing  both  observed  objects  and  events  at  a  higher  level  of  abstraction, 
and  allowing  higher-level  action  instructions  to  generate  behavior  in  a  hierarchically 
organized  manner.  More  generally,  different  subsystems  use  information  for  different 
purposes  so  that  a  number  of  different  processes  of  analysis  and  interpretation  of  sensory 
input occur in parallel, extracting different affordances from raw data from the optic array.
Recent work by brain scientists on ventral and dorsal visual pathways is but one
manifestation of this phenomenon.

The interactions between reactive and deliberative layers are complex and subtle,
especially as neither is in charge of the other, though at times either can dominate.
Moreover, the division is not absolute: information in the deliberative system can sometimes
be transferred to the reactive system (e.g., via drill and practice learning), and information in
the reactive system can sometimes be decompiled and made available to deliberative
mechanisms (though this is often highly error-prone).

For reasons explained in various papers available at the CogAff FTP site, it is possible
to conjecture that at a much later evolutionary stage a third class of mechanism developed,
again  using  and  redeploying  mechanisms  that  had  existed  previously.  The  new  type  of 
mechanism,  which  has  been  provisionally  labeled  “meta-management,”  provides  the  ability 
to  do  for  internal  processes  what  the  previous  mechanisms  did  for  external  processes: 
namely  it  supports  monitoring,  evaluation,  and  control  of  other  internal  processes,  including, 
for  instance,  thinking  about  how  to  plan,  or  planning  better  ways  of  thinking.  For  example,  a 
deliberative  system  partly  driven  by  an  independent  reactive  system  and  sensory 
mechanisms  can  unexpectedly  acquire  inconsistent  goals.  A  system  with  meta-management 
can  notice  and  categorize  such  a  situation,  evaluate  it,  and  perhaps  through  deliberation  or 
observation  over  an  extended  period,  develop  a  strategy  for  dealing  with  such  conflicts. 

Similarly,  meta-management  can  be  used  to  detect  features  of  thinking  strategies  and, 
perhaps  in  some  cases,  notice  flaws  or  opportunities  for  improvement.  Such  a  mechanism 
(especially  in  conjunction  with  an  external  language)  also  provides  a  route  for  absorption  of 
new  internal  processes  from  a  culture,  thereby  allowing  transmission  between  generations  of 
newly  acquired  information  without  having  to  wait  for  new  genetic  encodings  of  that 
information  to  evolve.  Through  internal  monitoring  of  sensory  buffers,  the  extra  layer  adds  a 
kind  of  self-awareness  that  has  been  the  focus  of  discussions  of  consciousness,  subjective 
experience, qualia, etc. As with external processes, the monitoring, evaluation, and
redirection of internal processes is neither perfect nor total and, as a result, mistakes can be
made  about  what  is  going  on,  inappropriate  evaluations  of  internal  states  can  occur,  and 
attempts  to  control  processing  may  fail,  for  example,  when  there  are  lapses  of  attention 
despite  firm  intentions. 

Another  feature  of  meta-management  is  its  ability  to  be  driven  by  different  collections 
of  beliefs,  attitudes,  strategies,  and  preferences,  in  different  contexts,  explaining  how  a 
personality  may  look  different  at  home,  driving  a  car,  in  the  office,  etc.  Besides  the  three 
main  concurrent  processing  layers  (reactive,  deliberative,  and  meta-management)  identified 
above  that  others  have  found  evidence  for,  a  number  of  additional  specialized  mechanisms 
are  needed,  including:  mechanisms  for  managing  short-  and  long-term  goals,  a  variety  of 
long-  and  short-term  memory  stores,  and  one  or  more  global  alarm  systems  capable  of 
detecting  a  need  for  rapid  global  re-organization  of  activity  (freezing,  fleeing,  attacking, 
becoming  highly  attentive,  etc.),  and  also  producing  that  re-organization. 

For  instance,  whereas  many  people  have  distinguished  primary  and  secondary  emotions 
(e.g.,  Damasio,  1994),  Sloman  and  his  colleagues  have  proposed  a  third  type,  tertiary 
emotions,  sometimes  referred  to  as  perturbances  (Sloman,  1998a;  Sloman  &  Logan,  1999). 
Primary  emotions  rely  only  on  the  reactive  levels  in  the  architecture.  Secondary  emotions 
require deliberative mechanisms. Tertiary emotions are grounded in the activities of
meta-management, including unsuccessful meta-management. There are other affective states
concerned  with  global  control,  such  as  moods,  which  also  have  different  relationships  to  the 
different  layers  of  processing.  Many  specific  states  that  are  often  discussed  but  very 
unclearly  defined,  such  as  arousal,  can  be  given  much  clearer  definitions  within  the 
framework  of  an  architecture  that  supports  them. 

It  looks  as  if  various  subsets  of  the  capabilities  described  here  arising  out  of  the  three 
layers and their interactions can be modeled in the architectures developed so far, for
example, Soar, ACT-R/PM, Moffatt and Frijda’s Will architecture (2000), and the various
logic-based models that dominate the ATAL (Architectures, Theories and Languages) series
of workshops (e.g., Wooldridge, Mueller, & Tambe, 1996; also see mas.cs.umass.edu/atal/),
and books like Wooldridge and Rao (1999).

However,  only  small  subsets  of  these  capabilities  can  be  modeled  at  present.  Any 
realistic  model  of  human  processing  needs  to  be  able  to  cope  with  contexts  including  rich 
bombardment with multi-modal sensory and linguistic information; where complex goals
and standards of evaluation are constantly interacting; where things often happen too fast for
fully rational deliberation to be used; where not everything that occurs falls into
a previously learned category for which a standard appropriate response is already known;
where decisions have to be taken on the basis of incomplete or uncertain information; and
where the activity of solving one problem or carrying out one intricate task can be subverted
by the arrival of new factual information, new orders, or new goals generated internally as a
side-effect  of  other  processes. 

Where the individual is also driving a fast-moving vehicle or is under fire, it is very
likely that a huge amount of the processing going on will involve the older reactive
mechanisms, including many concerned with bodily control and visual attention. It may be
some time before we fully understand the implications of such total physical immersion in
stressful situations, including the effects on deliberative and meta-management processes.
(For example, fixing attention on a hard planning problem can be difficult if bombs are
exploding all around you. Can our models explain why?)

5.5.2  Sim_Agent  and  CogAff 

Sloman and his colleagues’ general architectural toolkit, the Sim_Agent Toolkit, allows
them  to  explore  a  variety  of  new  ideas  about  complex  architectures.  It  is  not  an  architecture, 
but  a  steadily  developing  toolkit  for  exploring  architectures. 

The CogAff architecture schema is based on a 3 by 3 grid that provides a
framework for describing specific architectures according to the grid components present,
their control relationships, and how information flows between them. H-CogAff is a specific
human-like version that is a particularly rich special case. Other special cases include
various kinds of purely reactive (i.e., non-deliberative) architectures (perhaps insects or
reptiles), Brooks’ subsumption architectures, purely deliberative architectures (lots of old AI
systems, early versions of Soar and ACT), and so on.
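
One way to picture how a grid schema describes architectures by the components they contain is sketched below in Python. This is an illustration only: the three layers come from the text above, but the column names (perception, central processing, action), the function names, and the classification labels are assumptions made for the example. Control relationships and information flow are not modeled.

```python
from itertools import product

# Rows of the grid: the three processing layers named in the text.
LAYERS = ("reactive", "deliberative", "meta-management")
# Columns of the grid: assumed names, not given in this section.
COLUMNS = ("perception", "central", "action")
# The nine possible cells of the 3 by 3 grid.
GRID = frozenset(product(LAYERS, COLUMNS))


def layers_present(components):
    """List, in grid order, the layers that an architecture occupies."""
    unknown = set(components) - GRID
    if unknown:
        raise ValueError(f"not grid cells: {sorted(unknown)}")
    used = {layer for layer, _ in components}
    return [layer for layer in LAYERS if layer in used]


def classify(components):
    """Give a coarse (hypothetical) label based only on which layers are occupied."""
    layers = layers_present(components)
    if layers == ["reactive"]:
        return "purely reactive"
    if layers == ["deliberative"]:
        return "purely deliberative"
    if layers == list(LAYERS):
        return "three-layer"
    return "other"


# Two example descriptions: a reactive-only agent and one using every cell.
insect_like = {("reactive", column) for column in COLUMNS}
full_grid = set(GRID)
```

Here classify(insect_like) yields the purely reactive label and classify(full_grid) the three-layer label; a fuller description would also record which cells control or feed which others.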

Sloman  and  his  colleagues  also  wanted  a  toolkit  that  supported  exploration  of  a  number 
of  interacting  agents  (and  physical  objects,  etc.)  where  each  agent  has  a  variety  of  very 
different mechanisms running concurrently and asynchronously, yet influencing one
another.  They  also  wanted  to  be  able  to  very  easily  change  the  architecture  within  an  agent, 
change  the  degree  and  kind  of  interaction  between  components  of  an  agent,  and  speed  up  or 
slow  down  the  processing  of  one  or  more  sub-mechanisms  relative  to  others  (Sloman, 
1998b).  In  particular,  they  wanted  to  be  able  to  easily  combine  different  types  of  symbolic 
mechanisms  and  also  sub-symbolic  mechanisms  within  one  agent.  The  toolkit  was  also 
required  to  support  rapid  prototyping  and  interactive  development  with  close  connections 
between  internal  processes  and  graphic  displays. 

Because  other  toolkits  did  not  appear  to  have  the  required  flexibility  and  tended  to  be 
committed  to  a  particular  type  of  architecture,  Sloman  and  his  colleagues  built  their  own 
toolkit, which has been used for some time at the University of Birmingham and DERA,
Malvern. Their toolkit is described briefly in Sloman and Logan (1999) and in more
detail in the online documentation at the Birmingham Poplog FTP site
(ftp.cs.bham.ac.uk/pub/dist/poplog/). The code and documentation are freely available online.
The toolkit runs in Pop-11 in the Poplog system (inherently a multi-language AI system, so that
code  in  Prolog,  Lisp,  or  ML  can  also  be  included  in  the  same  process).  Poplog  has  become 
freely  available  (www.cs.bham.ac.uk/research/poplog/freepoplog.html). 

At  present,  Sloman  does  not  propose  a  specific  overarching  architecture  as  a  rival  to 
systems  like  Soar  or  ACT-R.  He  feels  that  not  enough  is  yet  known  about  how  human  minds 
work  and,  consequently,  any  theory  proposing  the  architecture  is  premature.  Instead,  he  and 
his  group  have  been  exploring  and  continually  refining  a  collection  of  ideas  about  possibly 
relevant  architectures  and  mechanisms.  Although  the  ideas  have  been  steadily  developing, 
Sloman  and  his  colleagues  do  not  believe  that  they  are  near  the  end  of  this  process.  So, 
although  one  could  use  a  label  like  CogAff  to  refer  to  the  general  sort  of  architecture  they  are 
currently  talking  about,  it  is  not  a  label  for  a  fixed  design.  Rather  CogAff  should  be  taken  to 
refer  to  a  high-level  overview  of  a  class  of  architectures  in  which  many  details  still  remain 
unclear.  The  CogAff  ideas  are  likely  to  change  in  dramatic  ways  as  more  is  learned  about 
how  brains  work,  about  ways  in  which  they  can  go  wrong  (e.g.,  as  a  result  of  disease,  aging, 
brain  damage,  addictions,  stress,  abuse  in  childhood,  etc.),  and  how  brains  differ  from  one 
species  to  another,  or  one  person  to  another,  or  even  within  one  person  over  a  lifetime. 

The  toolkit  is  still  being  enhanced.  In  the  short  term,  they  expect  to  make  it  easier  to 
explore  architectures  including  meta-management.  Later  work  will  include  better  support 
for  sub-symbolic  spreading  activation  mechanisms  and  the  development  of  more  reusable 
libraries,  preferably  in  a  language-independent  form. 

5.5.3  Summary 

The Sim_Agent toolkit and the goals its developers have for it have some commonalities
with  other  approaches.  The  need  for  a  library  of  components  is  acknowledged.  They 
emphasize  that  reactive  behaviors  are  necessary  and  desirable,  and  that  the  emotional 
aspects  arise  out  of  the  reactive  mechanisms.  It  provides  a  broad  range  of  support  for  testing 
and creating architectures. The toolkit provides support for reflection as a type of
meta-learning. Other architectures will need to support reflection as well, particularly where the
world  is  too  fast-paced  for  learning  to  occur  during  the  task  (John,  Vera,  &  Newell,  1994; 
Nielsen  &  Kirsner,  1994). 

The  features  that  the  toolkit  supports  help  define  a  description  of  architectural  types. 
The  capabilities  that  can  be  provided,  from  perception  to  action  and  from  knowledge  to 
emotion,  provide  a  way  of  describing  architectures. 

The  major  drawback  is  that  none  of  the  models  or  libraries  created  in  Sim_Agent  have 
been  compared  with  human  data  directly.  In  defense  of  this  lack  of  comparison,  Sloman 
claims  that  the  more  complex  and  realistic  an  architecture  becomes,  the  less  sense  it  makes 
to  test  it  directly.  Instead,  he  claims  that  the  architecture  has  to  be  tested  by  the  depth  and 
variety  of  the  phenomena  it  can  explain,  like  advanced  theories  in  physics,  which  also 
cannot  be  tested  directly. 

5.6  Engineering-Based Architectures and Models

There  is  a  history  of  studying  process  control  in  and  near  industrial  engineering  that 
includes studying human operators. This approach is not (yet) part of mainstream
psychology,  and  Pew  and  Mavor  (1998)  do  not  make  many  references  to  work  in  this  field. 

If  tank  operators  and  ship  captains  can  be  viewed  as  running  a  process,  and  we  believe 
they  can,  there  is  a  wide  range  of  behavioral  regularities  referenced  and  modeled  in 
engineering  psychology  that  can  be  generalized  and  applied  to  other  domains.  Major 
contributions  in  this  area  include  Reason’s  (1990)  book  on  errors,  Rasmussen’s  skill 
hierarchy  (1983),  the  Cognitive  Reliability  and  Error  Analysis  Method  (CREAM) 
methodology  for  analyzing  human  performance  (Hollnagel,  1998),  and  numerous  studies 
characterizing  the  strengths  and  weaknesses  of  human  operator  behavior  (de  Keyser  & 
Woods,  1990;  Sanderson,  McNeese,  &  Zaff,  1994). 

Engineers  have  also  created  intelligent  architectures.  These  architectures  have  almost 
exclusively  been  used  to  create  models  of  users  of  complex  machinery,  ranging  from 
nuclear power plants to airplanes. The models are often, but not always, tied to simulations
of those domains. Their approach is generally more practical. They are more interested in
approximate timing and the overt behavior than in detailed mechanisms. These developers
appear to be less interested in the internal mechanisms giving rise to behavior as long as the
model is usable and approximately correct.

These models of operators include models of nuclear power plant operators, the
Cognitive Simulation Model (COSIMO; Cacciabue et al., 1992), and the Cognitive
Environment Simulation (CES; Woods, Roth, & Pople, 1987). AIDE (Amalberti & Deblon,
1992) is a model of fighter pilot behavior; the Step Ladder Model or Skill-based,
Rule-based, Knowledge-based model is a generally applicable framework, originally formulated
in electronics troubleshooting (e.g., Rasmussen, 1983).

We  will  also  look  at  a  few  operator  models  in  more  detail. 

5.6.1  APEX

APEX (Freed & Remington, 2000; Freed et al., 1998; John et al., 2002) is a set of tools
for simulating human performance when interacting with interfaces to perform tasks, similar
to MIDAS (Laughery & Corker, 1997). The main driver for APEX is the need to model
behavior in environments, such as air traffic control and commercial jet flight decks, and to
help engineers design usable systems in these domains.

Powerful  action-selection  mechanisms  of  the  sort  developed  by  artificial  intelligence 
researchers  are  used  to  cope  adaptively  with  time-pressure  and  uncertainty,  and  to 
coordinate  the  execution  of  multiple  tasks  (i.e.,  strategic  multi-tasking).  Usability  is  taken 
very  seriously  (Freed  &  Remington,  2000).  A  high-level  modeling  language  is  included. 
Applications of APEX have included time-analysis of skilled behavior, partially-automated
human-factors design analysis, and creation of artificial human participants in
large-scale simulations.

Comments from Michael Freed were helpful in preparing this section.

This general approach has proven successful in allowing APEX to automate the
CPM-GOMS HCI analysis method (John & Kieras, 1996) and in reconstructing incidents
involving human error in a way that promises eventual error-prediction capabilities. Insofar
as it implements CPM-GOMS, it inherits CPM-GOMS’ empirical support. Consistent with
the  needs  of  domains  in  which  APEX  has  been  most  frequently  used,  the  action-selection 
architecture  emphasizes  capabilities  having  to  do  with  multi-task  management,  especially 
regarding  concurrency  control  and  strategic  task  management. 
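
CPM-GOMS rests on critical path analysis: the predicted task time is the length of the longest chain of dependent operators. The Python sketch below illustrates that calculation only. It is not APEX code, and the operator names, durations (in milliseconds), and dependencies are hypothetical values chosen for the example. It assumes the dependency graph has no cycles.

```python
def critical_path(durations, depends_on):
    """Return (total_time, path) for the longest chain of dependent operators.

    durations maps operator name -> duration; depends_on maps operator name
    -> list of operators that must finish before it can start.
    """
    finish = {}        # earliest finish time of each operator
    longest_pred = {}  # the predecessor that finishes last, or None

    def earliest_finish(op):
        if op not in finish:
            preds = depends_on.get(op, [])
            last = max(preds, key=earliest_finish, default=None)
            longest_pred[op] = last
            start = finish[last] if last is not None else 0
            finish[op] = start + durations[op]
        return finish[op]

    # The operator that finishes last ends the critical path.
    end = max(durations, key=earliest_finish)
    path = []
    while end is not None:
        path.append(end)
        end = longest_pred[end]
    return finish[path[0]], path[::-1]


# Hypothetical operators for pressing a key in response to a target.
DURATIONS = {
    "perceive_target": 100,
    "decide": 50,
    "move_eyes": 30,
    "move_hand": 300,
    "press_key": 100,
}
DEPENDS_ON = {
    "decide": ["perceive_target"],
    "move_eyes": ["decide"],
    "move_hand": ["decide"],
    "press_key": ["move_eyes", "move_hand"],
}

TOTAL, PATH = critical_path(DURATIONS, DEPENDS_ON)
```

With these invented numbers the eye movement runs in parallel with the hand movement and does not lengthen the task, so the critical path runs through perceive_target, decide, move_hand, and press_key for a total of 550 ms.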

APEX  was  created  by  Freed  as  part  of  his  doctoral  dissertation  and  continues  to  be 
developed  by  researchers  at  NASA  Ames  Research  Center  and  elsewhere.  They  are 
explicitly  concerned  about  building  a  community  of  users  to  share  ideas  and  models.  Further 
information,  including  APEX  itself,  is  available  through  search  engines. 

APEX  is  probably  best  described  as  an  engineering  model  because  it  has  been  designed 
to  serve  engineering  goals.  APEX  is  interesting  because  it  models  the  whole  operator,  from 
perception  to  action,  and  the  model  often  interacts  with  fairly  complete  and  complex 
simulations,  and  can  make  very  detailed  predictions  easily.  It  does  not  yet  include  learning, 
and  the  complex  results  past  CPM-GOMS  could  be  tested  more,  but  the  full  toolset  suggests 
that  interface  design  tools  based  on  cognitive  models  are  now  possible. 

5.6.2  Simplified  Model  of  Cognition  and  Contextual  Control  Model 

The  Simplified  Model  of  Cognition  (SMoC)  (Hollnagel  &  Cacciabue,  1991)  is  an 
extension  of  Neisser’s  (1976)  perceptual  cycle  and  describes  cognition  in  terms  of  four 
essential  elements:  (1)  observation/identification,  (2)  interpretation,  (3)  planning/selection, 
and (4) action/execution. Although these are normally linked in a serial path, other links are
possible  between  the  various  elements.  The  small  number  of  cognitive  functions  in  SMoC 
reflects  the  general  consensus  of  opinion  that  has  developed  since  the  1950s  on  the 
characteristics  of  human  cognition.  The  fundamental  features  of  SMoC  are  the  distinction 
between  observation  and  inference  (overt  vs.  covert  behavior),  and  the  cyclical  nature  of 
cognition  (cf.  Neisser,  1976). 

SMoC  was  formulated  as  part  of  the  System  Response  Generator  (SRG)  project 
(Hollnagel  &  Cacciabue,  1991).  SRG  was  a  software  tool  developed  to  study  the  effect  of 
human  cognition  (specifically  actions  and  decision  making)  on  the  evolution  of  incidents  in 
complex  systems. 

The  Contextual  Control  Model  (CoCoM;  Hollnagel,  1993)  is  an  extension  of  the  SMoC, 
and  addresses  the  issues  of  modeling  both  competence  and  control.  In  most  models  the  issue 
of  competence  is  supported  by  a  set  of  procedures  or  routines  that  can  be  employed  to 
perform a particular task when a particular set of predefined conditions obtains. CoCoM
further proposes that there are four overlapping modes of control (influenced by knowledge
and skill levels) that also influence behavior:

•  Scrambled control: where the selection of the next action is unpredictable. This is
the lowest level of control.

•  Opportunistic  control:  where  the  selection  of  the  next  action  is  based  on  the  current 
context  without  reference  to  the  current  goal  of  the  task  being  performed. 

•  Tactical control: where performance is based on some form of planning.

•  Strategic  control:  where  performance  takes  full  account  of  higher-level  goals.  This 
is  the  highest  level  of  control. 

The  transition  between  control  modes  depends  on  a  number  of  factors,  particularly  the 
amount  of  subjectively  available  time  and  the  outcome  of  the  previous  action.  These  two 
factors are interdependent, however, and also depend on aspects such as the task complexity
and  the  current  control  mode. 
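
The text does not give transition rules, so the Python sketch below uses an invented rule purely to illustrate how the two factors named above (subjectively available time and the outcome of the previous action) could move an operator model between the four modes. The function name, parameters, and step sizes are assumptions and are not taken from Hollnagel's model.

```python
from enum import IntEnum


class ControlMode(IntEnum):
    """The four CoCoM control modes, ordered from lowest to highest control."""
    SCRAMBLED = 0
    OPPORTUNISTIC = 1
    TACTICAL = 2
    STRATEGIC = 3


def next_mode(mode, available_time, needed_time, last_action_succeeded):
    """Hypothetical transition rule (an assumption for illustration only).

    Control rises one level when the last action succeeded and there is
    enough subjectively available time; otherwise it falls one level for
    each factor that is unfavorable.
    """
    enough_time = available_time >= needed_time
    penalties = (not last_action_succeeded) + (not enough_time)
    step = 1 if penalties == 0 else -penalties
    # Clamp to the valid range so control cannot leave the four modes.
    level = max(ControlMode.SCRAMBLED, min(ControlMode.STRATEGIC, mode + step))
    return ControlMode(level)
```

For example, under this invented rule a model in tactical control that fails an action while short of time drops to scrambled control, whereas success with ample time raises it to strategic control. A fuller treatment would also make the needed time depend on task complexity and the current mode, as the text notes.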

CoCoM has been used in the development of the CREAM (Hollnagel, 1998) within the
field  of  human  reliability  analysis.  The  CREAM  is  a  method  for  analyzing  human 
performance  when  working  with  complex  systems.  It  can  be  used  in  both  the  retrospective 
analysis  of  accidents  and  events,  and  in  predicting  performance  for  human  reliability 
assessment.  Extending  the  CREAM  is  presented  below  as  a  useful  project. 

5.6.3  Summary 

These  engineering-based  architectures  suggest  that  engineering  models  can  provide 
useful  behavior  even  when  the  internal  mechanisms  are  not  fully  tested  or  perhaps  even 
plausible.  These  architectures  suggest  that  some  of  the  difficulty  in  creating  the  architectures 
is  due  to  the  implicit  and  explicit  knowledge  that  psychologists  bring  with  them  regarding 
plausibility.  We  believe  psychologists’  domain  knowledge  leads  to  more  accurate  models 
but  slower  development. 

5.7  Summary  of  Recent  Developments  for  Modeling  Behavior 

This chapter has reviewed several architectures. These architectures and their
applications  show  that  it  is  becoming  increasingly  possible  to  create  plausible  and  useful 
architectures  based  on  a  variety  of  approaches. 

An agreed, formal scheme for classifying architectures would be useful. This ideal
classification scheme would note the sorts of tasks that each architecture performs best,
helping users to choose an architecture for a particular task. The best that we have found
is Table 3.1 in Pew and Mavor (1998, pp. 98-105). Our Table 5.1 provides a summary of the
architectures presented here in that same format as a supplement to their table. We have
included all relevant information of which we are aware for each architecture. In most cases
the developers of the architectures have helped complete their entry in this table. Another
approach  for  classifying  architectures  is  available  from  Logan  (1998). 

Developments in AI continue to be useful. The general AI methods discussed are not
included in this table because they are not broad enough to be considered a cognitive
architecture,  but  they  are  likely  to  be  useful  additions  to  architectures,  either  directly  or 
indirectly.  For  example,  genetic  algorithms  have  been  included  in  a  proposed  architecture 
(Holland, Holyoak, Nisbett, & Thagard, 1986), and planning algorithms have been included
as adjuncts to Soar (Gratch, 1998). These developments will help extend architectures by
providing algorithms for inclusion within architectures, particularly hybrid architectures.

There are several interesting trends to note. One is that the diversity of architectures is
not decreasing. The range of fundamental ideas on which to base architectures has widened from
simply problem solving. For example, EPAM is based on pattern recognition, and PSI and
architectures created in the Sim_Agent Toolkit are based on ideas about emotions.

Another interesting trend is that some aspects of the architectures are starting to merge
and be reused. The interaction aspects of EPIC have been reused by Soar and by ACT-R.
The Nottingham Interaction Architecture is similar in some ways and is getting similar reuse
(e.g., Jones et al., 2000). These strands are becoming quite similar to each other (Byrne,
Chong, Freed, Ritter, & Gray, 1999) and are quite likely to merge in the future.

The importance of model usability is becoming more recognized. COGENT provides an
example of how easy a modeling tool should be to pick up and use. Similar developments
with Soar and ACT-R are starting to emphasize reusable code, better documentation, and
better tutorial materials. Other architectures will have to follow suit to attract users and to
train and support their existing users. Newell (1990) wrote about the entry level (the bar)
being raised as architectures develop through competition. It is interesting that usability is
perhaps the first clear comparison level.

Table 5.1: Comparison of Architectures

Columns: Architecture; Original purpose; Submodels: Sensing and perception

1  EPAM
   Original purpose: Model high-level perception, learning, and memory
   Sensing and perception: Visual, auditory perceptual discrimination in real-time (assuming feature-based description of objects)

2  SDM
   Original purpose: Simulation of cerebellum as a content-addressable memory
   Sensing and perception: Can be used to recall the nearest stored memory to any encoded perceptual input

3  PSI
   Original purpose: Explores interaction of cognition, motivation, and emotion to build an integrated model of human action regulation
   Sensing and perception: Optical perception by “Hypercept” process that scans (simulated) environment for basic features. Raises hypotheses about sensory schemas to which features may belong and tests these hypotheses by subsequent scanning of environment (comparable to saccadic eye-movements). If pattern not recognizable, new schema generated

4  COGENT
   Original purpose: Design environment for modeling cognitive processes
   Sensing and perception: Input buffers that can be modified to represent vision and hearing

5  JACK as example of BDI architectures
   Original purpose: Constitute an industrial-strength framework for agent applications
   Sensing and perception: JAVA methods + inter-agent messaging

6  Sim_Agent Toolkit
   Original purpose: Explores architectures using rapid prototyping
   Sensing and perception: Defined by methods for each agent class

7  Engineering-based models (e.g., APEX)
   Original purpose: Provide models of humans in control loops
   Sensing and perception: Varies, but exists for most models

Table 5.1: Comparison of Architectures (continued)

Columns (Submodels): Working/Short-Term Memory; Long-Term Memory; Motor Outputs

1 EPAM
   Working/short-term memory: 4-7 slot STM; in some versions (e.g., EPAM-IV), more detailed implementation of auditory (Baddeley-like) STM and visual STM
   Long-term memory: Discrimination net. In recent versions, nodes of discrimination net used to create semantic net and productions
   Motor outputs: Eye movements, simple drawing behavior

2 SDM
   Working/short-term memory: Not modeled
   Long-term memory: Sparse Distributed Memory models related to PDP and neural-net memory models
   Motor outputs: Motor sequences can be learned. Nearest match memories can be sequences that could be behaviors

3 PSI
   Working/short-term memory: The head of a protocol memory that permanently makes a log of actions and perceptions
   Long-term memory: The remnants of logs decay with time. Strings of logs associated with need satisfaction or with pain will be reinforced and have a greater chance to survive and form a part of long-term memory than neutral sequences of events
   Motor outputs: Basic motor patterns (actions) combined to form complex sensory-motor programs by learning (i.e., by reinforcement of the successful sensory-motor patterns in logs)

4 COGENT
   Working/short-term memory: Various types supported
   Long-term memory: Various types supported
   Motor outputs: Simple buffer representation of commands

5 JACK as an example of BDI architectures
   Working/short-term memory: Object-oriented structures (JAVA), plus relational modeling support (JACK)
   Long-term memory: All JAVA support including database interfaces, etc. Support for data modeling in JAVA and C++ using JACOB (JACK Object Builder)
   Motor outputs: JAVA methods

6 Sim_Agent Toolkit
   Working/short-term memory: List structures
   Long-term memory: List structures, rules, and arbitrary Pop-11 data structures. Can also use neural nets, if required
   Motor outputs: Defined by methods for each agent class

7 Engineering-based models (e.g., APEX)
   Working/short-term memory: Usually simple, but extant
   Long-term memory: Usually simple, but extant
   Motor outputs: Usually extant, but usually not complex

Table 5.1: Comparison of Architectures (continued)

Columns (Knowledge Representation): Declarative; Procedural

1 EPAM
   Declarative: Chunks, schemas (templates); using nodes in discrimination net
   Procedural: Productions using nodes in discrimination net

2 SDM
   Declarative: A sparse set of memory addresses where data are addresses
   Procedural: Memories naturally form sequences that could be considered procedures

3 PSI
   Declarative: Sensory and sensory-motor patterns consisting of pointer structures forming schemas. A schema includes information about more basal elements and relations of elements in space and time, including language patterns pointing to sensory and sensory-motor patterns (implementation in progress)
   Procedural: Sensory-motor patterns forming automatisms

4 COGENT
   Declarative: Numbers, strings, lists, tuples, connectionist networks
   Procedural: Production rules, connectionist networks, Prolog

5 JACK as an example of BDI architectures
   Declarative: Object-oriented structures (JAVA), plus relational modeling support
   Procedural: JACK plans and JAVA methods

6 Sim_Agent Toolkit
   Declarative: List structures and arbitrary Pop-11 data structures (e.g., could be constrained to express logical assertions but need not be). Could use neural nets or other mechanisms
   Procedural: Rule sets and arbitrary Pop-11 procedures that can also invoke Prolog or external functions

7 Engineering-based models (e.g., APEX)
   Declarative: Varies, but usually simple
   Procedural: Varies, but usually simple. Many use some form of schemas

Table 5.1: Comparison of Architectures (continued)

Columns (Higher-Level Cognitive Functions): Learning; Planning; Decision Making; Situation Assessment

1 EPAM
   Learning: Chunking, creation of schemas, and production learning online (incremental) and stable against erroneous data
   Planning: Connections between templates used in planning
   Decision making: Knowledge based
   Situation assessment: Overt and inferred

2 SDM
   Learning: By incrementing weights across a probability distribution
   Planning: Does not plan, but can remember plans
   Decision making: Iterative memory recall process
   Situation assessment: Can learn a set of assessments and generalize these

3 PSI
   Learning: Associative and perceptual learning; operant conditioning; sensory-motor learning, learning goals (situations that allow need satisfaction) and aversions (situations or objects that cause needs)
   Planning: Built-in hill-climbing procedure: action schemata (i.e., sensory-motor patterns) are recombined to form new plans. If planning unsuccessful or impossible due to lack of information, trial-and-error procedures used to collect environmental information
   Decision making: Expectancy-value principle
   Situation assessment: Built in as part of problem solving

4 COGENT
   Learning: Common methods within connectionist modules
   Planning: Could be implemented in rule modules
   Decision making: Specific to module type. Can vary
   Situation assessment: None built in (users can specify)

5 JACK as an example of BDI architectures
   Learning: None built in (users can specify as required by their architecture)
   Planning: None built in (users can specify as required by their architecture)
   Decision making: Includes BDI computation model
   Situation assessment: Includes BDI computation model

6 Sim_Agent Toolkit
   Learning: None built in (e.g., Wright et al., 1996, included simple forms of deliberative mechanisms and meta-management)
   Planning: None built in (users can specify as required by their architecture). Logan’s A* with bounded constraints available, among others
   Decision making: None built in (users can specify as required by their architecture)
   Situation assessment: None built in (users can specify as required by their architecture)

7 Engineering-based models (e.g., APEX)
   Learning: Usually not extant
   Planning: Varies, some models do well
   Decision making: Usually good; decision making domain of these models
   Situation assessment: Varies, often implicit

Table 5.1: Comparison of Architectures (continued)

Columns (Multitasking): Serial/Parallel; Resource Representation

1 EPAM
   Serial/parallel: Serial processing; learning done in parallel
   Resource representation: Limited STM capacity, limited perceptual and motor resources (uses time parameters)

2 SDM
   Serial/parallel: Fully parallel recall process, serial recall of sequences
   Resource representation: Architecture too low-level for representation to be explicit

3 PSI
   Serial/parallel: System tries to fulfill different needs (i.e., water, energy, pain-avoidance, etc.); interrupts goal-directed behavior to profit from unexpected opportunities
   Resource representation: Allocation of time to run intention according to strength of underlying need and according to expectancy of success

4 COGENT
   Serial/parallel: Modules can work in parallel, but information passed between them serially
   Resource representation: Would vary with the knowledge included in modules

5 JACK as an example of BDI architectures
   Serial/parallel: Supports multiple computational threads handled safely within the JACK kernel, achieving atomic reasoning steps
   Resource representation: Agents have time perception. Time can be real or simulated (dilated, externally synchronized, etc.)

6 Sim_Agent Toolkit
   Serial/parallel: Discrete event simulation technique, with rule sets within each agent time-sliced, as well as different agents being time-sliced
   Resource representation: Allocation of cycles per time-slice can be made for each rule set, or for each agent. No built-in memory resource limits. Will differ for each architecture type created

7 Engineering-based models (e.g., APEX)
   Serial/parallel: Varies, sometimes explicit models
   Resource representation: Varies. Those that interact with simulations more advanced

Table 5.1: Comparison of Architectures (continued)

Columns: Goal/Task Management; Multiple Human Modeling; Implementation Platform

1 EPAM
   Goal/task management: Bottom up + 1 main goal per task simulated
   Multiple human modeling: Potential through multiple EPAM modules
   Implementation platform: Mac, PC (any system supporting Common Lisp). Graphic environment supported only for Macintosh

2 SDM
   Goal/task management: None
   Multiple human modeling: None
   Implementation platform: UNIX (easily ported)

3 PSI
   Goal/task management: There is a steady competition of different needs/motives to rule. Strongest will win and inhibit others
   Multiple human modeling: Potential through multiple PSI models with different “personalities” by varying starting parameters. Multiple agents can run in same environment, see each other, interact, and, to a certain degree, communicate
   Implementation platform: Windows 95, 98, 2000, NT

4 COGENT
   Goal/task management: None built in. Users can specify through module selection and programming
   Multiple human modeling: None
   Implementation platform: UNIX (X windows), Microsoft Windows

5 JACK as an example of BDI architectures
   Goal/task management: Built in. JACK Language includes: wait_for (condition), maintenance conditions, meta-level reasoning, etc.
   Multiple human modeling: Allows multiple agents, running together or distributed, to interact and communicate as a team or as adversaries. Extensions to the basic model (e.g., team models) also allowed
   Implementation platform: Runs on all platforms that support JAVA 1.1.3 or later. Graphic components (i.e., development environment) require JAVA 2 v1.2 or later

6 Sim_Agent Toolkit
   Goal/task management: None built in (users can specify as required by the architecture)
   Multiple human modeling: Toolkit allows multiple agents to sense one another, act on one another, and communicate with one another
   Implementation platform: Runs on any system supporting Poplog (and for graphics, X window system). Tested on Sun/Solaris, PC/Linux, DEC Alpha/UNIX. Should also run on other UNIXes and VAX VMS. Should work without graphics on Windows NT Poplog

7 Engineering-based models (e.g., APEX)
   Goal/task management: Varies. Some advanced
   Multiple human modeling: Some have none; some work in teams
   Implementation platform: Varies. Not usually designed for dissemination

Table 5.1: Comparison of Architectures (continued)

Columns: Architecture; Implementation Language; Support Environment

1 EPAM
   Implementation language: Common Lisp
   Support environment: Lisp programming + editing tools. Some graphic utilities for displaying eye movements, structure of discrimination tree, and task. Customized code used for each task modeled

2 SDM
   Implementation language: C, JAVA
   Support environment: None

3 PSI
   Implementation language: Pascal (Delphi 4)
   Support environment: Delphi 4 features

4 COGENT
   Implementation language: Prolog
   Support environment: Graphic and textual editors

5 JACK as an example of BDI architectures
   Implementation language: JAVA. JACK written in and compiles into pure JAVA
   Support environment: JACK Make utilities, and all available JAVA tools. JACK development environment (JDE) provides GUI for creating and editing agent structures. Further debugging and visualization tools under development

6 Sim_Agent Toolkit
   Implementation language: Pop-11 (but allows invocation of other Poplog languages (Prolog, Common Lisp, Standard ML) and external functions, e.g., C)
   Support environment: Poplog environment, including VED/XVED, libraries, incremental compiler, etc.

7 Engineering-based models (e.g., APEX)
   Implementation language: Varies
   Support environment: Often simple

Table 5.1: Comparison of Architectures (continued)

Columns: Validation; Comments

1 EPAM
   Validation: Extensive at many levels
   Comments: EPAM models focus on single, specific information processing task at a time. Not yet scaled up to multitasking situations. Used in high-knowledge domains (e.g., chess, with about 300,000 chunks)

2 SDM
   Validation: None
   Comments: SDM should be seen as system component (e.g., good way of representing long-term memory for patterns and motor behaviors in larger system)

3 PSI
   Validation: Achievement data and parameters of behavior compared between subjects and models in two different scenarios (BioLab and Island). Different human subjects can be modeled by varying parameters

4 COGENT
   Validation: Would be by architecture. Some have been done by modeling previously validated models

5 JACK as an example of BDI architectures
   Validation: Would be by architecture. None known

6 Sim_Agent Toolkit
   Validation: Would be by architecture. None known

7 Engineering-based models (e.g., APEX)
   Validation: By model. Usually validated with expert opinion. Some may be compared with data
   Comments: Wide range of models here

Review of Recent Developments and
Objectives: Specific Projects


We now examine specific projects within the general application areas noted in
Chapters 2, 3, and 4, broadly grouped into projects that support the objectives in the
previous chapters, that is, of providing more complete performance, supporting integration
of models, and improving model usability. The format of the projects follows the general
format used in Pew and Mavor (1998). Where appropriate, this summary also comments on
the feasibility and concerns that may arise if the projects are implemented in Soar, a current
common approach for computer-generated forces. The estimates are uniformly optimistic to
allow comparisons. The estimates are in terms of programmer or analyst time, and assume
adequate supervision and cooperation with other organizations.

6.1  Projects  Providing  More  Complete  Performance 

The  projects  presented  here  address  the  issues  raised  in  Chapter  2.  They  are  grouped 
into  three  main  categories.  We  also  note  some  potential  additional  uses  for  models  of 
behavior  as  well  as  current  uses  in  synthetic  environments. 

6.1.1  Gathering  Data  From  Simulations 

It is very clear and consonant with Pew and Mavor (1998, chap. 12) that data need to be
gathered  to  validate  models  of  human  and  organizational  behavior.  An  approach  at  which 
they  hint  is  to  instrument  synthetic  environments.  Synthetic  environments  should  be 
instrumented  not  only  for  playback,  but  also  in  a  way  to  provide  data  for  developing  and 
testing  models.  While  the  data  are  not  directly  equivalent  to  real-world  behavior,  as  the 
environment  becomes  more  realistic  the  data  should  become  more  realistic  as  well. 

A uniform representation for data from simulations should be created. This
representation should be readable by humans, at least in some formats.

Creating  summary  measures  will  also  be  necessary.  Otherwise  the  sheer  volume  of  data 
may  preclude  its  analysis.  The  individual  actions  of  control  are  not  likely  to  be  useful  on 
their  own  (e.g.,  pressing  an  accelerator)  but  will  be  required  to  build  higher-level 
summaries.  Creating  these  summaries  is  likely  to  represent  an  additional  research  agenda 
item  requiring  AI,  domain  knowledge,  and  some  understanding  of  behavioral  data. 
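
As a minimal sketch of what a uniform representation and a first summary measure might look like, the following Python fragment writes each logged action as one human-readable JSON line and aggregates the low-level actions per agent. The record fields (time_s, agent, action, value), the JSON-lines format, and the two summary measures are assumptions made for illustration; they are not taken from any existing simulator.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SimEvent:
    """One timestamped action from an instrumented simulator (hypothetical schema)."""
    time_s: float       # simulation time in seconds
    agent: str          # who acted (a human participant or a model)
    action: str         # low-level control action, e.g. "press_accelerator"
    value: float = 0.0  # optional magnitude of the action


def to_log_line(event):
    """Serialize an event as a single human-readable JSON line."""
    return json.dumps(asdict(event), sort_keys=True)


def from_log_line(line):
    """Rebuild an event from a line written by to_log_line."""
    return SimEvent(**json.loads(line))


def summarize(events):
    """Per-agent summary: action counts and mean time between successive actions."""
    by_agent = {}
    for event in sorted(events, key=lambda e: e.time_s):
        by_agent.setdefault(event.agent, []).append(event)
    summary = {}
    for agent, evs in by_agent.items():
        gaps = [b.time_s - a.time_s for a, b in zip(evs, evs[1:])]
        summary[agent] = {
            "n_actions": len(evs),
            "action_counts": dict(Counter(e.action for e in evs)),
            "mean_gap_s": sum(gaps) / len(gaps) if gaps else None,
        }
    return summary
```

Higher-level summaries (e.g., recognizing a maneuver from a run of control actions) would need the domain knowledge noted above and are not attempted here.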

The payback could be quite large for developing models. Analysis of data from
synthetic environments might also provide insights into the quality of the simulation (e.g.,
how quickly someone could act and whether they were limited by the simulation’s ability to
display information) and provide insights about the implementation of doctrine (e.g., how
often tank crews actually follow doctrine). When done in cooperation with a simulator’s
developers, the resources required for this task could be quite modest. Otherwise, it could
take some time. Developing initial automated summaries is a 6- to 12-month effort.

6.1.2  Understanding  Expectations  of  Behavior 

Providing  realistic  behavior  requires  understanding  what  people  expect  from  other 
people  and  what  aspects  of  an  adversary  are  necessary  for  training.  (These  two  may  be  quite 
different.)  In  one  sense,  this  means  understanding  the  Turing  test:  what  is  necessary  to 
appear  human?  More  important,  however,  is  knowing  what  is  necessary  to  train  people.  A 
model  that  passes  the  Turing  test  and  appears  human  might  be  weaker  or  unusual  in  some 
way.  Thus,  training  with  the  model  might  not  result  in  transfer  of  learning  or  result  in 
learning  an  incorrect  behavior. 

A useful exercise would be to study which characteristics of behavior make a model
appear human (so-called believable agents). The model must start with competencies; it
must be able to perform tasks. It should also include errors, hesitations, and variations
in behavior.

Work  with  the  Soar  Quake-bot  on  how  firing  accuracy  and  movement  speed  make 
agents believable is an example of what is required (Laird & Duchi, 2000) to understand
what  people  think  is  human.  The  Soar  Quake-bot  has  been  evaluated  on  such  things  as  firing 
accuracy  with  observers  asked  to  rate  its  humanness.  The  measure  of  humanness,  however, 
does  not  reveal  how  good  the  Quake-bot  is  with  respect  to  training.  Nor  does  it  reveal  what 
aspects  of  the  Quake-bot  should  be  made  more  (or  less)  human  to  improve  training.  The 
current  belief  is  that  appearing  (or  behaving)  more  human  makes  a  better  opponent  to  train 
against,  but  we  do  not  know  of  any  evidence  to  support  this  belief. 

Another example is the Fuzzy Logic Adaptive Model of Emotions (FLAME). In this
work (Seif El-Nasr, Yen, & Ioerger, 2000), several models of a pet that followed the user
around in a virtual house were tested for believability. The model that included learning and
fuzzy behavior was judged the best. While not a complete test, this type of project starts
to find out what makes agents believable through tests. In this example, learning and
emotions were both helpful.

A useful 6-month to 1-year study would be to examine a range of models and humans in
a synthetic environment, noting observers’ comments and behavior toward a range of
behavior. What makes an agent appear human might be these overt aspects, but it might also
include implicit effects, such as second-order (or lagged) dependencies in behavior. The
results would be important for training and also useful for creating models used in analysis.
This project is similar to, but was conceived separately from, a call by
Chandrasekaran and Josephson (1999) to develop a better understanding of how, and how
far, to develop models.
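
One simple way to look for lagged dependencies in a logged behavior series is the sample autocorrelation at a given lag. The sketch below is an illustration only: it assumes that behavior has already been coded as a numeric series (e.g., response times or a 0/1 indicator of firing), and the function name is hypothetical.

```python
def lagged_autocorrelation(series, lag):
    """Sample autocorrelation of a numeric behavior series at the given lag."""
    n = len(series)
    if lag <= 0 or lag >= n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        return 0.0  # a constant series carries no dependency information
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance
```

A value near zero at every lag suggests that successive actions are unrelated; human behavior and model behavior could then be compared lag by lag.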

The  results  are  also  essential  for  understanding  how  to  help  modelers.  The  results  will 
point  out  the  most  likely  mismatches  to  be  left  in  models  because  modelers  do  not  consider 
such  behavior  abnormal.  This  will  provide  suggestions  about  where  comparisons  with  data 
are particularly needed by models. As this is basically experimental work, fewer resources are
needed, but running the experiment and analyzing the results will take up to a year for
preliminary studies.
6.1.3 Including Learning in Models

Work on creating agents in synthetic environments has been successful; however, one
particularly useful aspect that has not been modeled is learning. A worthwhile project would
be to take a learning algorithm and put it to use within a synthetic environment, either as
part of a problem solver or as an observer. There are a variety of learning algorithms and
models that would be appropriate. Some examples include connectionism; one of the
hybrid learning architectures developed within the ONR program (Gigley & Chipman,
1999); Programmable User Models (Young, Green, & Simon, 1989b); Soar with learning
turned on; ACT-R; EPAM; or any of a wide variety of machine-learning algorithms.

Creating  a  model  that  learns  will  be  difficult.  This  task  is  large  and  would  allow 
multiple subprojects to be attempted. It could be supported by a wide range of resources.
Including  learning  with  problem  solving  has  been  difficult  in  the  past,  but  it  is  likely  to  lead 
to  more  accurate  agents  that  may  be  useful  for  testing  and  developing  tactics. 

Soar models exist that function fairly well in a synthetic environment. If these could be
used,  a  small  project  of  a  programmer-year  or  two  should  be  able  to  create  an  initial  model 
that  learns  in  a  synthetic  environment.  Attaching  a  learning  component  to  find  regularities  in 
behavior  is  likely  to  take  at  least  that  much  time.  Both  projects  would  provide  potential  PhD 
topics  and  are  broad  enough  to  be  supported  by  a  wide  range  of  resources. 
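
To make the idea of a learning observer concrete, the following sketch counts which action tends to follow which in an observed stream and predicts the most frequent successor. It is a deliberately simple stand-in (first-order transition counts) for the learning algorithms listed above, not an implementation of any of them; the class and method names are hypothetical.

```python
from collections import Counter, defaultdict


class TransitionObserver:
    """Learns first-order regularities (what follows what) from observed actions."""

    def __init__(self):
        self.counts = defaultdict(Counter)  # previous action -> Counter of next actions
        self.previous = None

    def observe(self, action):
        """Record one observed action and update the transition counts."""
        if self.previous is not None:
            self.counts[self.previous][action] += 1
        self.previous = action

    def predict(self, action):
        """Return the most frequently observed successor of action, or None."""
        successors = self.counts.get(action)
        if not successors:
            return None
        return successors.most_common(1)[0][0]
```

An observer of this kind could run alongside an existing agent without changing the agent's problem solving, which is what makes the observer variant the smaller of the two projects.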

6.1.4 Including a Unified Theory of Emotions

There  are  three  specific  projects  related  to  modeling  emotions  and  other  behavioral 
moderators  in  architectures  that  we  can  propose:  (1)  adding  general  emotional  effects,  (2) 
adding reactive emotions, and (3) testing emotional models with performance data. While
work is ongoing implementing models like this in Soar (Chong, 1999; Gratch, 1999) and
ACT-R (Belavkin et al., 1999), the domain is large. Projects can range from a few months to
implement  a  simple  emotional  effect  to  several  years  or  decades  to  incorporate  a 
significant  amount. 

Adding general emotional effects. As noted above, it is possible to start to realize
emotions  and  affective  behavior  within  toolkits  like  Sim  Agent  and  general  cognitive 
architectures  like  ACT-R  and  Soar.  Including  emotions  will  provide  a  more  complete 
architecture  for  modeling  behavior  and  a  platform  for  performing  future  studies  of  how 
emotions  affect  problem  solving.  Including  emotions  may  also  provide  a  way  to  duplicate 
personality  and  provide  another  approach  to  account  for  appropriate  variations  in  behavior. 
Hudlicka (1997) provides a list of intrinsic and extrinsic behavior moderators (similar to the
categories suggested in Ritter, 1993b) that could be modeled. Boff and Lincoln (1988)
provide  a  list  of  regularities  related  to  fatigue  and  other  related  stressors  that  might  be 
considered  for  testing  against  a  general  model  of  emotional  effects.  For  example,  by  making 
ACT-R’s  motivation  sensitive  to  local  performance  (rule  successes  and  failures),  we  have  fit 
the  Yerkes-Dodson  law  (Belavkin  &  Ritter,  2000). 

These  models  should  move  from  applying  to  a  single  task  to  multiple  tasks.  They  would 
then  become  modifications  to  the  architecture  and  thus  reusable. 
Adding reactive emotions. Modeling reactive and long-term moderators as well as
slower-acting  behavioral  moderators  is  worthwhile.  The  effect  of  stress  also  changes  the 
state  of  the  competence  in  important  ways.  A  proportion  of  troops  engaged  in  active  combat 
will  become  ineffective  as  a  result  of  fear  and  stress-fatigue.  Stress  would  also  be  increased 
by  the  number  of  casualties  taken  by  a  given  platoon,  length  of  time  without  sleep,  weather 
conditions,  perceived  chance  of  survival,  and  so  on.  Modeling  these  effects  at  the  micro- 
level  of  individuals,  following  known  distributions,  would  advance  the  realism  of 
simulations  in  interesting  ways  and  support  teaching  existing  doctrine. 
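
The following sketch illustrates what modeling such effects at the level of individuals could look like. Everything numeric in it is a placeholder assumption: the equal weights, the 72-hour sleep cap, and the logistic mapping from stress to the probability of becoming ineffective are not drawn from data and would have to be replaced by the known distributions referred to above.

```python
import math
import random

# Hypothetical, uncalibrated weights; real values would have to be fit to data.
DEFAULT_WEIGHTS = {"casualties": 0.25, "sleep": 0.25, "weather": 0.25, "survival": 0.25}


def _clip01(x):
    return min(max(x, 0.0), 1.0)


def stress_level(casualty_fraction, hours_awake, bad_weather, perceived_survival,
                 weights=DEFAULT_WEIGHTS):
    """Combine the stressors named in the text into one score in [0, 1]."""
    factors = {
        "casualties": _clip01(casualty_fraction),
        "sleep": _clip01(hours_awake / 72.0),  # the 72-hour cap is an assumption
        "weather": 1.0 if bad_weather else 0.0,
        "survival": 1.0 - _clip01(perceived_survival),
    }
    return _clip01(sum(weights[name] * value for name, value in factors.items()))


def p_ineffective(stress, midpoint=0.5, steepness=8.0):
    """Assumed logistic mapping from stress to the chance of becoming ineffective."""
    return 1.0 / (1.0 + math.exp(-steepness * (stress - midpoint)))


def count_ineffective(n_troops, stress, rng):
    """Sample, individual by individual, how many troops become ineffective."""
    p = p_ineffective(stress)
    return sum(1 for _ in range(n_troops) if rng.random() < p)
```

Passing a seeded random.Random instance as rng keeps simulation runs repeatable, which matters when model output is later compared with data.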

In  production  system  architectures,  these  emotions  can  initially  be  implemented  by 
changing  the  decision  (rule-matching)  procedure,  adding  rules  to  make  parameter  changes, 
and  by  augmenting  working  memory  to  include  affective  information  (e.g.,  an  operator  or 
state looks good or bad). These types of changes are being applied to an existing model,
which  matches  adult  behavior  well,  to  better  match  children’s  more  emotional  behavior 
(Belavkin  et  al.,  1999).  These  emotional  effects  should  improve  the  match  to  the  children’s 
performance  by  (1)  slowing  down  performance  in  general,  (2)  slowing  down  initial 
performance  as  the  child  explores  the  puzzle  driven  by  curiosity,  and  (3)  abandoning  the 
task  if  performance  is  not  successful.  This  work  should  be  extended  and  applied 
more  widely. 

Testing  emotional  models  with  performance  data.  Many  of  the  theories  of  emotions 
proposed  have  not  been  compared  with  detailed  data.  Partly  this  may  be  because  there  is  not 
always  a  lot  of  data  available  on  how  behavior  changes  with  emotions.  It  is  no  doubt  a 
difficult  factor  to  manipulate  safely  and  reliably.  But  the  models  must  not  just  be  based 
on  intuitions. 

The  use  of  simulators  may  provide  a  way  to  obtain  further  data  with  some  validity. 
Better  instrumentation  of  some  primary  features  of  emotions  (e.g.,  heart  rate,  blood 
pressure) is providing new insights (Picard, 1997; Stern, Ray, & Quigley, 2001) and will be
necessary  for  testing  models  of  emotions. 

Some argue that emotions are necessary for problem solving. Examples of brain-damaged
patients (e.g., Damasio’s Elliot [1994]), who have impaired problem solving and
impaired emotions, are put forward. It is not clear whether emotions per se are required, or
whether multiple aspects of behavior are impaired as well as emotions by the trauma. Others
argue from first principles that emotions (realized as changes in motivation due to local success
and failure during problem solving in this example) can improve performance (Belavkin,
2001). A model compared with data may help answer whether this is true. Clearly, AI
models of scheduling do not have the same troubles scheduling an appointment despite their
lack of emotion.

6.1.5  Including  Errors 

There  are  two  premises  that  underpin  the  modeling  of  erroneous  behavior.  The  first 
premise  is  that  the  attribution  of  the  label  error  to  a  particular  action  is  a  judgment  made  in 
hindsight.  The  identification  of  the  erroneous  action  forms  the  starting  point  for  further 
investigation  to  identify  the  underlying  reasons  why  a  particular  person  executed  that 
particular  action  in  that  particular  situation.  In  other  words,  the  erroneous  action  arises  as  the 
result  of  a  combination  of  factors  from  a  triad  of  sources:  the  person  (psychological  and 
physiological factors), the system (in the most general sense of the term), and the
environment (including the organization in which the system is deployed).

The  second  premise  acknowledges  that  an  error  is  simply  another  aspect  of  behavior.  In 
other  words,  any  theory  of  behavior  should  naturally  encompass  erroneous  behavior.  The 
behavior can be judged as erroneous only with respect to a description of what constitutes
correct  behavior. 

Once these premises are accepted, it becomes apparent that modeling erroneous
behavior  is  actually  an  inherent  and  important  part  of  modeling  behavior.  If  the 
psychological  and  physiological  limitations  of  human  behavior  are  incorporated  into  a 
model  of  human  behavior,  then  particular  types  of  erroneous  behavior  should  naturally 
occur  in  certain  specific  situations.  The  corollary  of  this  argument  is  that  an  understanding 
of  erroneous  behavior  can  be  used  as  the  basis  for  evaluating  models  of  behavior.  So,  if  a 
human  performs  a  task  correctly  in  a  given  situation,  the  model  should  also  be  able  to 
perform  the  task  correctly  in  the  same  situation.  If  the  situation  is  changed,  however,  and  the 
human  generates  erroneous  behavior  as  a  result,  the  model  should  also  generate  the  same 
erroneous  behavior  as  the  human  in  the  new  situation,  without  any  modifications  being 
required  to  the  model. 

Modeling error therefore depends on understanding the concept of error (its nature,
origins, and causes), and central to this is the need for an accepted means of describing the
phenomenon (Senders & Moray, 1991). In other words, a taxonomy of human error is
required with respect to these tasks.

The  utility  of  the  taxonomic  approach,  however,  depends  on  the  understanding  that  the 
taxonomy  is  generated  with  a  particular  purpose  in  mind.  In  other  words,  the  taxonomy  has 
to  reflect: 

•  A  particular  notion  of  what  constitutes  an  error. 

•  A particular level of abstraction at which behavior is judged to be erroneous.

•  A  particular  task  or  domain. 

There  is  a  need  to  be  very  clear  about  the  classes  of  errors  and  their  origin  in  the  models 
so that the appropriate ones can be included. In the military context, for example, a major
source  of  error  is  communication  breakdown.  One  approach  to  developing  an  appropriate 
taxonomy  of  errors  for  the  military  domain  is  to  use  the  scheme  that  lies  at  the  heart  of  the 
CREAM  (Hollnagel,  1998).  The  CREAM  purports  to  be  a  general  purpose  way  of  analyzing 
human  behavior  in  both  a  retrospective  and  a  predictive  manner.  Although  the  method  was 
developed  on  the  basis  of  several  years  of  research  into  human  performance,  mainly  in  the 
process  industries,  it  is  intended  to  be  applicable  to  any  domain. 

The  CREAM  uses  a  domain-independent  definition  of  what  constitutes  an  erroneous 
action  (also  called  error  modes  or  phenotypes).  One  of  the  goals  of  the  CREAM  is  to  be  able 
to  identify  the  chain  of  precursors  for  the  various  error  modes.  Identifying  the  chain  is 
achieved  by  means  of  a  set  of  tables  that  define  categories  of  actions  or  events.  At  the 
highest  level,  there  are  three  types  of  tables: 
•  Human (or operator),

•  Technological  (or  system),  and 

•  Organizational  (or  environment). 

Within  these  categories  there  are  sub-category  tables.  So,  for  example,  the  human  tables 
include  observation,  interpretation,  planning,  and  so  on. 

The  individual  actions  or  events  are  paired  together  across  tables  on  the  basis  of 
causality  or,  to  use  a  more  neutral  term,  in  a  consequent-antecedent  relationship.  When  the 
CREAM  is  used  to  analyze  a  particular  accident  or  incident  retrospectively,  the  aim  is  to 
build  up  the  list  of  possible  chains  of  events  and  actions  that  led  to  the  accident  or  incident. 
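
The retrospective use of consequent-antecedent links can be sketched as a walk backward from an error mode through the tables. The sketch below assumes the tables have been flattened into one mapping from each consequent to its possible antecedents; the entries in EXAMPLE_LINKS are illustrative placeholders and are not the contents of the actual CREAM tables.

```python
# Illustrative placeholder links (consequent -> possible antecedents);
# these are NOT the actual CREAM table entries.
EXAMPLE_LINKS = {
    "action at wrong time": ["faulty planning", "communication failure"],
    "faulty planning": ["wrong interpretation"],
    "communication failure": ["noise"],
}


def antecedent_chains(links, error_mode):
    """List every chain leading back from error_mode through antecedent links.

    Each chain starts at the error mode and ends at an event with no further
    antecedents. Events already in a chain are skipped to avoid cycles.
    """
    chains = []

    def extend(chain):
        antecedents = [a for a in links.get(chain[-1], []) if a not in chain]
        if not antecedents:
            chains.append(chain)
            return
        for antecedent in antecedents:
            extend(chain + [antecedent])

    extend([error_mode])
    return chains
```

With the placeholder links, antecedent_chains(EXAMPLE_LINKS, "action at wrong time") yields two candidate chains, one through planning and one through communication; an analyst would then check each against the incident report.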

The  contents  of  the  tables  are  domain-specific,  so  the  first  step  in  developing  the 
taxonomy  for  agents  in  synthetic  environments  depends  on  identifying  the  appropriate 
categories  of  events  and  actions  for  the  military  domain.  These  categories  and  the  links 
between  individual  actions  or  events  will  be  generated  from  a  combination  of  knowledge 
elicited  from  domain  experts  and  a  review  of  the  appropriate  literature. 

The  second  step  is  to  generate  the  possible  chains  of  actions  and  events  that  precede  the 
various  error  modes,  based  on  information  available  from  reports  of  real  accidents  or 
incidents.  This  process  will  involve  access  to  desensitized  accident  or  incident  reports — like 
those  used  in  the  Confidential  Human  Factors  Incident  Reporting  Programme  (CHIRP; 
Green,  1990)  originally  operated  by  the  RAF  Institute  of  Aviation  Medicine — that  can  be 
analyzed  and  coded  using  the  domain-specific  CREAM  tables  generated  in  Step  1.  Where 
omissions  from  the  tables  are  detected,  or  links  between  actions  do  not  already  exist,  these 
should  be  added  to  the  tables. 

The  possible  causal  chains  of  events  or  actions  generated  by  the  second  step  will 
provide  the  basis  for  a  specification  of  behavior  in  a  particular  situation.  Models  of  behavior 
should  yield  the  same  sequences  of  actions  and  events  in  the  same  circumstances.  The 
specification  of  behavior  can  therefore  be  used  to  test  the  models  of  behavior  for 
compliance,  during  development,  with  the  model  being  modified  as  appropriate  to  match 
the  specification. 
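
One simple way to picture such a compliance test is to check whether each specified chain appears, in order, within the trace of events a model produces. The sketch below assumes chains and traces are plain lists of event labels in temporal order; the labels, `SPEC`, `MODEL_TRACE`, and both function names are hypothetical.

```python
from typing import List, Sequence


def occurs_in_order(trace: Sequence[str], chain: Sequence[str]) -> bool:
    """True if every step of `chain` appears in `trace` in the same order."""
    remaining = iter(trace)
    # `step in remaining` consumes the iterator, so order is enforced.
    return all(step in remaining for step in chain)


def unmatched_chains(trace: Sequence[str], spec: List[List[str]]) -> List[List[str]]:
    """Return the specified chains that the model trace fails to reproduce."""
    return [chain for chain in spec if not occurs_in_order(trace, chain)]


# Hypothetical specification (from incident analysis) and model output.
SPEC = [
    ["high workload", "missed observation", "action at wrong time"],
    ["inadequate training", "faulty plan", "action at wrong time"],
]
MODEL_TRACE = ["high workload", "radio call", "missed observation", "action at wrong time"]

missing = unmatched_chains(MODEL_TRACE, SPEC)
print(missing)
```

Chains reported as missing point to places where the model might need modification, or where the circumstances of the run differed from those of the incident.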

In  addition,  the  results  of  the  analysis  of  the  incident  behavior  provide  a  basis  for 
evaluating  the  veracity  of  synthetic  environments.  Performance  in  the  real  world  (as 
described  in  the  incident  reports)  can  be  compared  with  the  way  people  behave  when 
performing  in  the  synthetic  environment.  There  should  be  a  high  degree  of  correspondence 
between  the  two.  If  there  is  a  mismatch,  the  mismatch  suggests  that  there  is  a  difference 
between  the  real  world  and  the  synthetic  environment,  which  may  be  worth  further 
investigation  to  identify  the  source  of  the  difference. 

One  other  beneficial  side  effect  of  the  CREAM  analysis  is  that  the  resultant  chains  of 
actions  and  events  can  be  used  in  training  personnel  to  manage  error.  If  common  chains  of 
actions  or  events  can  be  identified,  it  may  be  possible  to  train  personnel  to  recognize  these 
chains,  and  take  appropriate  remedial  action  before  the  erroneous  action  that  gives  rise  to 
the  incident  is  generated. 




Human Systems IAC SOAR, 2003


Chapter  6.  Review  of  Recent  Developments  and  Objectives:  Specific  Projects 


Initial models that include erroneous behavior can best be created with an existing
model.  One  to  three  years  of  work  should  lead  to  an  initial  model  that  includes  some  errors 
and  has  been  validated  against  human  behavior. 

6.1.6  Including  a  Unified  Theory  of  Personality 

It would be useful to identify features that lead to modeling personality,
problem-solving styles, and operator traits. While models that choose between strategies have been
created,  there  are  few  models  that  exhibit  personality  by  choosing  between  similar  strategies 
(although  see  Nerb,  Spada,  &  Ernst,  1997,  for  an  example  used  to  put  subjects  in  a  veridical, 
but  artificial,  social  environment).  Personality  will  be  an  important  aspect  of  variation  in 
behavior  between  agents. 

Including  personality  requires  a  task  (and  the  model)  to  include  multiple  approaches  and 
multiple  successful  styles.  It  is  these  choices  that  can  thus  appear  as  a  personality.  If  the  task 
requires  a  specific,  single  approach,  it  is  not  possible  to  express  a  strategy.  Psychology,  or  at 
least  cognitive  psychology,  has  typically  not  studied  tasks  that  allow,  or  particularly 
highlight,  multiple  strategies.  Looking  for  multiple  strategies  has  also  been  difficult  because 
it  requires  additional  subjects  and  data  analysis  that  before  has  not  represented  real 
differences in task performance. Differences in strategies, however, lead to variance in
behavior  (e.g.,  Delaney,  Reder,  Staszewski,  &  Ritter,  1998;  Siegler,  1987). 

There appear to be at least the following ways to realize variance in behavior that might
appear like personality: learning, differences in knowledge, differences in utility theory and
initial  weighting,  and  differences  in  emotional  effects.  Including  a  subset  of  these  effects  in 
a  model  would  fulfill  a  need  for  a  source  of  regular,  repeatable  differences  between  agents 
in  a  situation. 
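
As a rough illustration of one of these sources, differences in initial weighting, the sketch below gives two agents the same strategies but different utility weights. The strategy names, feature scores, and weights are invented for the example and are not drawn from any of the architectures reviewed.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical strategies, each scored on illustrative features in [0, 1].
STRATEGIES: Dict[str, Dict[str, float]] = {
    "advance under cover": {"speed": 0.3, "safety": 0.9},
    "direct assault": {"speed": 0.9, "safety": 0.2},
}


@dataclass
class Agent:
    name: str
    weights: Dict[str, float]  # how much this agent values each feature

    def utility(self, features: Dict[str, float]) -> float:
        """Weighted sum of a strategy's features."""
        return sum(self.weights.get(k, 0.0) * v for k, v in features.items())

    def choose(self, strategies: Dict[str, Dict[str, float]]) -> str:
        """Pick the strategy with the highest utility for this agent."""
        return max(strategies, key=lambda s: self.utility(strategies[s]))


cautious = Agent("cautious", {"speed": 0.2, "safety": 0.8})
bold = Agent("bold", {"speed": 0.8, "safety": 0.2})

for agent in (cautious, bold):
    print(agent.name, "->", agent.choose(STRATEGIES))
```

Because the weights are fixed per agent, the difference in choice is regular and repeatable, which is the property asked for above. Learning or emotional effects could be layered on by letting the weights change over time.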

All  of  the  current  cognitive  architectures  reviewed  here  and  in  Pew  and  Mavor  (1998) 
can  support  models  of  personality.  These  types  of  changes  should  be  straightforward,  as 
long as there are multiple strategies. In Soar, personality can be expressed as differences in
task knowledge, as well as differences in knowledge about strategy preferences either
absolutely or based on different sets of state and strategy features. ACT-R appears to learn
better  and  faster  which  strategy  to  use  compared  with  a  simple  Soar  model,  but  ACT-R 
requires  additional  state  features  (Ritter  &  Wallach,  1998).  Models  in  both  architectures  can, 
however,  modify  their  choice  of  strategies.  The  role  of  (multiple)  strategies  has  been 
investigated  within  the  EPAM  architecture  in  several  tasks,  such  as  concept  formation 
(Gobet et al., 1997) and expert memory (Gobet & Simon, 2000; Richman, Gobet,
Staszewski, & Simon, 1996; Richman et al., 1995).

These models could also be crossed with emotional and other non-cognitive effects to
see how personality types respond differently in different circumstances (broadly defined).
This  could  even  be  extended  to  look  at  how  teams  with  different  mixes  of  personalities  work 
together  under  stress. 

The  amount  of  work  to  realize  a  model  in  this  area  will  depend  on  the  number  of  factors 
taken  into  account  by  the  model.  Providing  a  full  model  of  personality  and  how  it  interacts 
with  tasks  and  with  other  models  is  a  fantasy  at  this  point.  However,  a  minimal  piece  of 
work  would  take  an  existing  model  and  give  it  more  of  a  personality.  A  more  extensive 
project over a year or two would apply several of these techniques and see how it starts to
match  human  data. 


6.1.7  Including  a  Model  of  Situation  Awareness  and  Rapid  Decision  Making 

Novices  have  to  do  problem  solving.  Experts  can  do  problem  solving  but  save  effort 
(or  improve  their  problem-solving  performance)  by  recognizing  solutions  based  on  the 
problem. Viewed broadly, a model that does this transition starts to provide an
explanation  of  situation  awareness  and  Rapid  Decision  Making  (RDM)  as  a  result  of 
expertise  and  recognition. 

Able  (Larkin,  1981)  and  its  recent  re-implementations  (Ritter  et  al.,  1998b)  provide  a 
simulation  explaining  the  path  of  development  from  novice  to  expert  in  formal  domains 
(i.e.,  those  where  behavior  is  based  on  manipulated  equations  such  as  physics  or  math).  The 
early  (or  barely)  Able  model  works  with  a  backward-chaining  approach,  that  is,  it  starts  with 
what  is  desired  and  chooses  domain  principles  to  apply  based  on  what  will  provide  the 
desired  output.  This  approach  is  applied  recursively  until  initial  conditions  are  found.  The 
chunking  mechanism  in  Soar  gives  rise  to  new  rules  that  allow  the  model  to  use  a  forward- 
chaining  method  that  is  faster.  That  is,  from  the  initial  conditions  new  results  are  proposed. 
The  rules  are  applied  until  the  desired  result  is  found.  Students  at  the  University  of 
Nottingham  have  applied  the  Able  mechanism  to  several  new  domains.  Their  examples  are 
available  at  www.nottingham.ac.uk/pub/soar/nottingham/student-projects.html. 
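
The novice-to-expert shift described above can be pictured with a toy example. This is a heavily simplified sketch, not Larkin's Able code and not Soar's chunking mechanism: the two principles, the quantity names, and the `learned` list (a stand-in for learned rules) are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical domain principles: quantity -> (required inputs, equation).
PRINCIPLES: Dict[str, Tuple[Tuple[str, ...], Callable[..., float]]] = {
    "force": (("mass", "accel"), lambda m, a: m * a),
    "accel": (("dv", "dt"), lambda dv, dt: dv / dt),
}


def solve_backward(goal: str, known: Dict[str, float], learned: List[str]) -> float:
    """Novice: start from the goal and recurse until given quantities are reached."""
    if goal in known:
        return known[goal]
    inputs, equation = PRINCIPLES[goal]
    values = [solve_backward(q, known, learned) for q in inputs]
    known[goal] = equation(*values)
    if goal not in learned:
        learned.append(goal)  # remember principles in the order they succeeded
    return known[goal]


def solve_forward(goal: str, known: Dict[str, float], learned: List[str]) -> float:
    """Expert: apply remembered principles from the givens until the goal appears."""
    for quantity in learned:
        inputs, equation = PRINCIPLES[quantity]
        if quantity not in known and all(i in known for i in inputs):
            known[quantity] = equation(*(known[i] for i in inputs))
        if goal in known:
            break
    return known[goal]


givens = {"mass": 2.0, "dv": 6.0, "dt": 3.0}
learned: List[str] = []
novice_answer = solve_backward("force", dict(givens), learned)
expert_answer = solve_forward("force", dict(givens), learned)
print(learned, novice_answer, expert_answer)
```

The backward pass searches from the goal, while the forward pass simply fires what was learned, in order, from the initial conditions. In the real mechanism the learned rules carry their own conditions rather than relying on a stored ordering.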

Work  could  be  done  to  translate  this  mechanism,  which  has  worked  in  Lisp  and  in 
several  versions  of  Soar,  into  other  architectures  and  extend  it  from  a  simulation  to  a  full 
process  model.  This  would  require  a  rather  modest  amount  of  effort,  less  than  a 
programmer-year  to  get  started  if  the  programmer  was  familiar  with  Soar.  Applying  it  in  a 
realistic  domain  would  take  longer. 

6.1.8  Using  Tabu  Search  to  Model  Behavior 

The  internal  architecture  of  a  combatant  might  be  constructed  from  a  perceptual  module 
that  is  closely  coupled  to  the  synthetic  environment  and  can  be  modified  by  plug-in  items 
that  alter  the  incoming  data  to  be  processed  (night-vision  aids,  etc.).  The  results  of 
perception  are  crudely  classified  using  a  learning  system  such  as  a  multi-layer  perceptron, 
which  triggers  a  rapid  emotional  response  and  consequent  reactive  behavior.  This  behavior 
might  be  generated  using  an  SDM  that  finds  the  nearest  match  to  previous  scenarios  and  is 
capable  of  producing  a  sequence  of  outputs  rather  than  a  single-state  result.  Both  perception 
and  emotional  response  are  calibrated  by  a  perceptual  and  personality  model  that  may  be 
unique  to  individual  entities,  albeit  assigned  from  a  known  distribution. 

The  cognitive  processing  would  be  rule-based  using  an  established  cognitive  model,  for 
example,  ACT-R,  with  planning  activities  augmented  by  a  Tabu  search.  There  would  be 
interactions  between  the  state  of  the  entity  (including  its  emotional  state)  and  the  cognitive 
processing  based  on  psychological  data  on  human  performance  under  stress.  This  approach 
is similar to, and perhaps a generalization of, Sloman’s meta-architecture and the Soar and
PSI architectures.
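
For readers unfamiliar with Tabu search, the sketch below shows its basic shape on a toy planning problem: ordering waypoints to shorten a route. It is a generic illustration under stated assumptions, not a component of ACT-R or of any system discussed here; the waypoints, the swap neighborhood, and the `tenure` and `iterations` values are arbitrary choices.

```python
from collections import deque
from itertools import combinations
from math import dist
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]


def route_length(route: Sequence[Point]) -> float:
    """Total straight-line distance along the route."""
    return sum(dist(a, b) for a, b in zip(route, route[1:]))


def tabu_search(start: Sequence[Point], cost: Callable[[Sequence[Point]], float],
                iterations: int = 50, tenure: int = 5) -> Tuple[Point, ...]:
    current = best = tuple(start)
    tabu = deque(maxlen=tenure)  # recently used swaps are forbidden for a while
    for _ in range(iterations):
        candidates = []
        for i, j in combinations(range(len(current)), 2):
            neighbor = list(current)
            neighbor[i], neighbor[j] = neighbor[j], neighbor[i]
            neighbor = tuple(neighbor)
            # Aspiration: a tabu swap is still allowed if it beats the best plan so far.
            if (i, j) not in tabu or cost(neighbor) < cost(best):
                candidates.append((cost(neighbor), (i, j), neighbor))
        if not candidates:
            break
        # Move to the best admissible neighbor even if it is worse than the
        # current plan; this is what lets the search leave local minima.
        _, move, current = min(candidates)
        tabu.append(move)
        if cost(current) < cost(best):
            best = current
    return best


waypoints = [(0, 0), (5, 0), (1, 0), (4, 0), (2, 0), (3, 0)]
plan = tabu_search(waypoints, route_length)
print(route_length(waypoints), route_length(plan))
```

The short-term memory of recent moves is the defining feature. In an agent, the cost function would be supplied by the cognitive model, and could itself depend on the entity's emotional state.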

6.2 Projects Supporting Integration

The projects presented here roughly address the issues raised in Chapter 3. Integration is
approached in two ways here: integrating model components and integrating the model with
simulations in more psychologically plausible ways. Several projects described in this
subsection could be equally at home in the set of projects for making modeling routine
because the two areas are related.

6.2.1  Models  of  Higher-Level  Vision 

It has been argued that an understanding of higher-level vision is necessary for
continued development of models in synthetic environments (Laird, Coulter, Jones, Kenny,
Koss, & Nielsen, 1997) and we agree (Ritter et al., 2000). Neisser’s (1976) perceptual cycle
is  just  starting  to  be  explored  with  models. 

There are several areas of Higher-Level Vision (HLV) that are of particular interest for
military  modeling.  These  areas  include: 

•  How  information  from  long-term  memory  indicates  incoming  danger  or 
serious  change  in  the  environment. 

• How HLV directs attention.

• How HLV integrates various aspects of information, or integrates
information  occurring  at  different  times. 

•  How  HLV  can  be  used  to  facilitate  learning. 

•  How  HLV  can  be  used  in  planning  and  problem  solving. 

To put it simply, HLV is at the interface between Lower-Level Vision (LLV) and
postulated memory entities such as productions, schemas, concepts, and so on. At the
present time, this interface is poorly understood, perhaps because LLV and long-term
memory  are  not  understood  in  a  sufficiently  stable  way.  (However,  see  Kosslyn  &  Koenig, 
1992,  for  neuropsychological  hypotheses  about  HLV.) 

Most  models  of  cognition  such  as  Soar  and  ACT-R  (actually,  most  architectures 
reviewed by Pew & Mavor, 1998) use modeler-coded information, which avoids dealing
with the interface between LLV and long-term memory constructs. Neural nets for vision
have been used to go from pixel-like information to features or even higher but have not
been incorporated into higher-cognition models. CAMERA (Tabachneck-Schijf, Leonardo,
& Simon, 1997), and to a certain extent EPAM (Feigenbaum & Simon, 1984; Richman &
Simon,  1989),  explore  ways  in  which  features  may  be  extracted  from  low-level 
representation,  and  may  be  combined  into  long-term  memory  constructs. 

The  relationship  of  HLV  and  problem  solving  is  undoubtedly  an  area  where  more 
research  should  be  carried  out.  For  example,  modeling  instruction  and  training  requires  a 
theory of how low-level acoustic input merges with low-level visual input and connects to
long-term  memory  knowledge.  In  some  cases  vehicles  and  gunfire  will  be  heard  rather  than 
seen  and  sounds  will  direct  visual  attention  in  the  appropriate  direction.  Perceptual  models 
of  hearing  are  also  well-developed  and  exploited  with  dramatic  success  in,  for  example,  the 
MPEG-2 compression standard that is likely to form the basis for much broadcast and
recorded sound in the future. The variance among individuals is large for both auditory and
visual perception, and both processes are degraded temporarily or permanently by intense
overload,  as  is  likely  in  a  military  environment. 

Work extending this approach to create integrated architectures (Byrne et al., 1999; Hill,
1999; Ritter & Young, 2001) is ongoing. Significant progress will require at least a
year-long project, and a longer period would be more appropriate.

6.2.2  Tying  Models  to  Task  Environments 

Tying  cognitive  models  to  synthetic  environments  in  psychologically  plausible  ways 
should  be  easier.  There  are  two  approaches  that  seem  particularly  useful  and  plausible  that 
we  can  ground  with  particular  suggestions  for  work.  They  are  consistent  with  Pew  and 
Mavor’s (1998, p. 200) short-term goal for perceptual front-ends.

The  first  approach  is  to  provide  a  system  for  cognitive  models  to  access  ModSAF’s 
display  and  pass  commands  to  it.  This  approach  has  the  advantage  that  it  hides  changes  in 
ModSAF  from  the  programmer/analyst  and  from  the  model.  The  disadvantage  is  the  need 
for  ModSAF  experts,  programmers,  users,  time,  and  money  to  make  it  work.  There  has  been 
such  a  system  created  for  Soar  models  to  use  ModSAF  (Schwamb,  Koss,  &  Keirsey,  1994), 
but  it  is  our  impression  that  this  system,  although  it  was  quite  useful,  needs  further 
development  and  dissemination. 

The  second  approach  is  to  create  a  reusable  functional  model  of  interaction  based  on  a 
particular  graphics  system  or  interface  tool  (as  does  the  Nottingham  Functional  Interaction 
Architecture  and  ACT-R/PM).  A  functional  rather  than  a  complete  model  may  be  more 
appropriate  here  as  a  first  step.  This  functional  approach  has  been  already  created  in  Tcl/Tk 
(Lonsdale  &  Ritter,  2000),  Garnet  and  Common  Lisp  (Ritter  et  al.,  2000),  Visual  Basic 
(Ritter,  2000),  Windows  bitmaps  (St.  Amant  &  Riedl,  2001),  Windows  98  objects  (Misker, 
Taatgen,  &  Aasman,  2001),  and  most  recently  in  JAVA.  They  could  be  created  in  Amulet, 
X-windows,  Delphi,  or  a  variety  of  similar  systems,  each  of  which  allows  models  to  interact 
with  synthetic  environments  through  a  better  programming  interface.  A  functional  model 
would  then  provide  the  necessary  basis  for  improving  the  accuracy  and  psychological 
plausibility  of  interaction. 
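
To make the idea of a functional model of interaction more concrete, the sketch below shows a simulated eye and hand acting on a toy display. It is an illustration of the general pattern only, not the Nottingham Functional Interaction Architecture or ACT-R/PM; the class names, the `fovea_radius` and `reach` parameters, and the widgets are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Optional


@dataclass
class Widget:
    """A stand-in for an object on a real interface or simulation display."""
    name: str
    x: float
    y: float
    clicked: bool = False


class SimEye:
    """Reports only what lies near the current fixation point."""

    def __init__(self, widgets: List[Widget], fovea_radius: float = 50.0):
        self.widgets = widgets
        self.fovea_radius = fovea_radius
        self.x = self.y = 0.0

    def fixate(self, x: float, y: float) -> None:
        self.x, self.y = x, y

    def visible(self) -> List[Widget]:
        return [w for w in self.widgets
                if hypot(w.x - self.x, w.y - self.y) <= self.fovea_radius]


class SimHand:
    """Moves a pointer and clicks whatever is under it."""

    def __init__(self, widgets: List[Widget], reach: float = 5.0):
        self.widgets = widgets
        self.reach = reach
        self.x = self.y = 0.0

    def move_to(self, x: float, y: float) -> None:
        self.x, self.y = x, y

    def click(self) -> Optional[Widget]:
        for w in self.widgets:
            if hypot(w.x - self.x, w.y - self.y) <= self.reach:
                w.clicked = True
                return w
        return None


display = [Widget("map", 100, 100), Widget("fire button", 400, 300)]
eye, hand = SimEye(display), SimHand(display)
eye.fixate(390, 290)
seen = eye.visible()            # only widgets near the fixation point
target = seen[0]
hand.move_to(target.x, target.y)
pressed = hand.click()
print([w.name for w in seen], pressed.name)
```

The model sees and acts only through the eye and hand, so the same model code could in principle be reused with another toolkit by reimplementing these two classes. Timing and accuracy constraints could then be added to move from a functional toward a more complete model.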

This  approach  to  providing  models  access  to  information  in  simulations  could  also 
support  creating  cognitive  models  in  general,  such  as  for  problem  solving,  working  memory, 
and  the  effect  of  visual  interaction.  These  could  be  later  assimilated  back  into  models  and 
architectures  in  the  synthetic  environments. 

An  excellent  programmer  very  familiar  with  their  language  can  now  create  an  initial 
system  in  about  two  weeks.  Integrating  and  applying  these  models  takes  several  months 
to  a  year. 

6.2.3  Ongoing  Review  of  Existing  Simulations 

To  provide  for  reuse  and  to  understand  the  current  situation,  a  review  of  simulation 
systems  used  (for  as  broad  a  geographic  region  as  possible,  working  with  allied  nations  if 
possible)  should  be  created  that  is  similar  to  the  listing  in  Pew  and  Mavor  (1998,  chap.  2 
Annex). This listing, for example, could initially be created by an intercalated year (co-op)
student  and  then  maintained  as  part  of  standing  infrastructure.  This  listing  could  provide  an 
initial  basis  for  understanding  what  the  total  needs  were  and  the  totality  of  current 
simulation  efforts.  While  the  U.S.  Defense  Modeling  and  Simulation  Office  may  do  this  in 
the  United  States,  we  do  not  know  of  similar  efforts  in  the  United  Kingdom. 

6.2.4  Focus  on  a  Flagship  Task 

Supporting  all  the  uses  of  synthetic  forces  as  shown  earlier  in  Table  1.1  with  a  single 
model of behavior is probably impossible in the short term. The uses of simulations in
operations  research,  training  individual  group  behavior,  and  examining  new  materials  or 
doctrine  are  too  disparate  to  be  met  by  a  single  approach.  While  the  various  levels  and  uses 
of  simulations  mentioned  here  are  related  by  the  real  world  they  all  represent,  it  does  not 
appear  to  be  possible  in  the  next  5  to  10  years  to  integrate  them  to  the  extent  to  which  the 
real  world  is  integrated. 

While  there  may  be  some  systems  that  allow  multiple  use,  and  there  will  certainly  be 
some  reuse  between  these  areas,  a  focus  for  work  must  be  selected.  Therefore,  a  more 
narrow  focus  on  the  most  important  uses  should  be  adopted  by  funding  agencies.  Taking  a 
more  focused  approach  appears  to  be  happening  in  several  places  already.  A  selective  focus 
on  the  most  approachable  or  natural  set  of  uses  is  more  likely  to  be  successful  in  the  short 
term  and  may  provide  a  better  foundation  upon  which  to  build  in  the  long  term.  Discussion 
of  these  issues  should  be  grounded,  if  possible,  with  a  set  of  potential  uses  with  possible 
systems  and  domains  that  will  be  used  in  the  next  5  to  10  years.  Complete  unification  is  not 
likely in that time period; nevertheless, significant reuse should be sought.

Having  a  focus  would  also  support  the  choice  of  a  specific  application.  Applications  can 
then be chosen with a user audience in mind. Having a specific audience will help the
application  to  be  useful  and  seen  as  useful  by  a  well-defined  user  community. 

Work  that  attempts  to  serve  too  many  needs  will  serve  all  of  them  poorly.  Projects  and 
research  programs  will  have  to  pick  a  domain  and  an  application  (or  two),  and  work  with 
them.  This  application  could  be  an  existing  use  or  application  or  it  could  be  a  new  use. 
Work with simulations for training often has high payoffs. Augmenting existing training
would  be  a  natural  place  to  consider  starting. 

The  students  being  trained  could  also  be  used  to  help  test  the  simulation.  Apocryphal 
tales  from  MIT  suggest  that  building  computer-based  tutors  to  deliver  instruction  is  as 
useful  for  learning  as  using  the  resulting  tutors.  Creating  and  validating  these  models  would 
be  good  training  for  such  students  as  well. 

6.2.5  A  Framework  for  Integrating  Models  With  Simulations 

Perhaps  the  most  significant  current  requirement  is  a  way  to  integrate  multiple  cognitive 
and  behavioral  architectures  into  synthetic  environments.  Currently,  it  takes  a  large  amount 
of  effort  to  introduce  new  models  of  behavior  and  connect  them  directly  to  simulations  via 
the  Distributed  Interactive  Simulation  (DIS)  protocol.  Coupling  cognitive  architectures  to  a 
simulation  via  ModSAF  is  probably  marginally  easier  because  ModSAF,  while  difficult  to 
use, provides physical models and an interface to the network. The left-hand side of Figure
6.1 shows the organization of systems like Tac-Air Soar that interact with ModSAF to
generate  behavior. 

A  worthwhile  medium-  to  long-range  goal  would  be  to  develop  utilities  to  support 
making a tool like ModSAF even more modular. The core activities of supporting
communication across the network for simulation and supporting the physical model need to
be provided, but are not of particular interest for modeling behavior.

Efforts have attempted to provide similar interfaces for Soar; however, they have never
been fully successful. They have made hooking up Soar easier but have not yet made it easy
(e.g., Ong, 1995; Ong & Ritter, 1995; but also see the most recent work by Jones, 2001, and
Wallace, 2001). Work on the Tank-Soar simulator (provided as a demo in the latest release
of  Soar,  Soar  8.3)  might  provide  a  path  for  this. 

The right-hand side of Figure 6.1 shows how future systems might interact with
ModSAF using the same interface that users see through a simulated eye and hand designed
to allow models to interact with synthetic environments (Ritter, Jones, Baxter, & Young,
1998a). The interface to the physical simulation could no doubt be made more regular and
easier to use so that other architectures, such as Sim Agent, could be hooked up to it. We
suspect this project might take a good programmer familiar with ModSAF about half-time
over a year because we had a similar system built in 2 weeks by someone who was an expert
in their graphic programming language. A much longer time should be allowed. This system
requires knowing ModSAF very well because it will make use of all of ModSAF and may
require  extending  ModSAF. 


Figure 6.1: On the left, a functional description of Tac-Air Soar and how it uses ModSAF. On
the right, a perceptual interface to ModSAF.

6.2.6 A Framework for Integrating Knowledge

Currently,  there  are  multiple  knowledge  sets  (as  models)  in  different  simulations  that 
exist  in  multiple  formats.  It  would  be  useful  to  create  a  framework  for  integrating  multiple 
knowledge  sets,  allowing  the  knowledge  to  be  reused  in  different  simulations. 

One  way  to  create  a  framework  for  integrating  knowledge  is  to  create  a  task  editor  that 
could take a knowledge set and compile it for different architectures. The editor would have
to be based on a high-level description of knowledge, such as generic tasks (Wielinga,
Schreiber, & Breuker, 1992). These generic tasks would then be compiled into things such
as an ACT-R or Soar rule-set.

There  are  potentially  huge  payoffs  from  this  very  high-risk  project.  First,  this  project 
would provide a way to reuse knowledge in multiple simulations. Second, the reuse that
would  arise  would  help  validate  models  and  might  provide  a  way  forward  for  validating 
architectures.  Third,  this  project  would  provide  another  way  of  documenting  behavior 
models.  The  (presumably)  graphic  representation  would  allow  others  to  browse  and 
understand  the  model  on  a  high  level.  Fourth,  it  would  assist  in  writing  models.  In  most 
cases,  there  are  a  lot  of  low-level  details  in  creating  these  models  that  are  not  of  theoretical 
interest but require attention, such as using the same attribute name consistently (recent Soar
interfaces now support this). A high-level compiler for knowledge like this would bring with
it all the advantages traditionally associated with high-level languages. When done for Soar,
the  higher-level  language  allowed  models  to  be  built  two  to  three  times  faster 
(Yost,  1992,  1993). 

PC-Pack  (www.epistemics.co.uk)  is  a  potential  tool  to  start  building  upon. 
Implementing an initial, demonstration version of this approach would take a good
programmer  6  to  12  months.  Putting  it  to  use  would  take  longer. 

6.2.7  Methods  for  Comparing  Modeling  Approaches 

We find ourselves in a position where a number of different approaches to simulating
human behavior are available. Some of these approaches, at least, are based on datasets
close enough to see themselves as rivals, and make competing claims about their suitability
and  quality.  How  can  we  assess  and  compare  them? 

There can, of course, be no one method that answers such a question. Earlier chapters of
this report have discussed how practical considerations such as usability and
communicability  of  models  come  into  play  as  well  as  scientific  qualities  such  as  agreement 
with  data.  Thus,  a  wide  range  of  comments  about  a  model  or  architecture  can  be  relevant  to 
choosing  between  them. 

However, there are some methods available that are too loose and varied to constitute a
“technique” but are useful nonetheless for comparing and contrasting such differing
approaches. They take the form of matrix exercises in which a range of modeling
approaches are pitted against a battery of concrete scenarios to be modeled. Young and
Barnard (1987) provide the basic rationale for such a method and explain how it can be used
to judge the fit and scope of a modeling approach. They argue, first, that the modeling
approaches need to be applied to concrete scenarios. It is not sufficient to try comparing
approaches on the basis of their “features” or “characteristics.” Second, it is important to use
a range of scenarios. Taking just a single case will inevitably introduce a bias towards or
against  certain  approaches,  and  will  fail  to  provide  an  indication  of  their  scope.  Young, 
Barnard,  Simon,  and  Whittington  (1989a)  provide  a  short  example  of  such  a  matrix  exercise, 
and  show  how  the  entries  in  the  matrix  can  be  interpreted. 
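
The bookkeeping behind such a matrix exercise is simple, as the sketch below suggests. The approach names, scenario names, and cell entries are placeholders invented for illustration; they are not results from Young and Barnard or from any exercise cited here.

```python
from typing import Dict, List, Optional, Tuple

# Placeholder rows (modeling approaches) and columns (concrete scenarios).
APPROACHES = ["approach A", "approach B"]
SCENARIOS = ["scenario 1", "scenario 2", "scenario 3"]

# Each cell holds a qualitative judgment, or None where the approach
# could not be applied to the scenario at all.
matrix: Dict[Tuple[str, str], Optional[str]] = {
    ("approach A", "scenario 1"): "good fit",
    ("approach A", "scenario 2"): "partial fit",
    ("approach A", "scenario 3"): None,
    ("approach B", "scenario 1"): "partial fit",
    ("approach B", "scenario 2"): None,
    ("approach B", "scenario 3"): "good fit",
}


def scope(approach: str) -> List[str]:
    """Scenarios the approach could address at all (its breadth of coverage)."""
    return [s for s in SCENARIOS if matrix.get((approach, s)) is not None]


def uncovered() -> List[str]:
    """Scenarios that no approach could address, which mark gaps for future work."""
    return [s for s in SCENARIOS
            if all(matrix.get((a, s)) is None for a in APPROACHES)]


for approach in APPROACHES:
    print(approach, "covers", scope(approach))
print("uncovered:", uncovered())
```

Reading along a row indicates the scope of one approach. Reading down a column shows which approaches complement each other on a given scenario.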

This kind of matrix exercise derives from the idea of a “bake-off” between rival
approaches  but  also  differs  in  important  respects.  There  is  unlikely  to  be  a  “winner,”  one 
approach  that  is  regarded  as  the  best  in  all  respects.  Moreover,  the  matrix  exercise  is 
fundamentally  cooperative  rather  than  competitive.  Instead  of  finding  the  “best”  approach, 
bake-offs  provide  a  tool  for  probing  the  scope  of  applicability  of  the  different  approaches, 
and  investigating  their  relative  strengths  and  weaknesses,  advantages  and  disadvantages,  for 
later  modification  and  fusion.  Pew  and  Mavor  appear  to  call  for  this  kind  of  activity  (1998, 
pp.  336-339)  as  well. 

Some  exercises  of  this  kind  have  been  performed  in  public.  At  the  Research  Symposia 
associated  with  the  CHI  conferences  in  1993  and  1994,  Young  (in  1993)  and  Young  and  C. 
Lewis  (in  1994)  organized  such  matrix  exercises  on  the  design  of  an  undo  facility  for  a 
shared  editor  (1993),  and  on  the  analysis  of  the  persistent  unselected  window  error  and  of 
the  design  of  an  automated  bank-teller  machine  as  a  walk-up-and-use  device  (in  1994). 
Furthermore,  there  are  precedents  for  such  an  exercise  in  a  military  research  context.  In 
1993,  NASA  funded  a  comparative  study  of  models  of  pilot  checklist  completion.  The 
Office  of  Naval  Research  has  funded,  on  a  longer  time  scale,  multiple  analyses  and 
modeling  of  several  interactive  tasks  using  hybrid  architectures  (Gigley  &  Chipman,  1999). 
The  speech  recognition  community  in  the  United  States  uses  this  approach  in  a  quite 
competitive  way  as  well. 

The  U.S.  Air  Force  has  recently  started  a  similar  program  called  Agent-Based  Modeling 
and  Behavior  Representation  (AMBR)  to  explore  models  of  complex  behavior 
(www.williams.af.mil/html/ambr.html).  This  multi-team  project  comparing  four  cognitive 
architectures  was  recently  reported  at  the  2001  Computer  Generated  Forces  Conference.  For 
an overview, see Gluck and Pew (2001a; 2001b); Tenney and Spector (2001) provide a
summary  of  the  model  to  data  fits  in  the  most  recent  comparison  round.  Several  more 
iterations  of  comparisons  across  architectures  using  different  types  of  tasks  are  planned. 

A  final  but  important  point  about  such  an  exercise  is  that  it  cannot  be  done  successfully 
inexpensively.  The  exercise  requires  earmarked  and  realistic  funding  to  provide  useful 
results.  A  considerable  amount  of  work  is  required:  first  in  negotiating,  agreeing,  and  then 
specifying  a  set  of  concrete  and  clearly  described  scenarios,  ideally  with  associated 
empirical  data;  and  then  subsequently  for  applying  the  modeling  approaches  to  the 
scenarios,  performing  the  comparisons,  and  drawing  conclusions.  Multiple  research  groups 
are  used,  and  the  funding  has  been  leveraged  by  the  groups’  existing  work  and  multiple 
funding  sources. 

6.2.8 (Re)Implementing the Battlefield Simulation

There  are  strong  arguments  for  implementing  communicating  agents  and  intra-agent 
processes  in  JAVA.  These  are  discussed  in  Bigus  and  Bigus  (1997),  and  in  the  context  of 
JACK Intelligent Agents by Busetta et al. (1995b). In fact, there are powerful arguments for
building the entire synthetic agent simulation in JAVA as described below. This is possible
within the High Level Architecture (HLA) framework. Implementing Soar in JAVA has
also been mooted (Schwamb, 1998), as well as ACT-R (see www.jactr.sourceforge.net for
information on a preliminary JAVA implementation of ACT-R 4, as of May 2001), although
the  usability  of  these  architectures  would  suffer  for  this. 

A  core  system  implementation  is  needed  that  can  then  be  accessed  through  Application 
Programming  Interfaces  (APIs).  Supporting  software  is  available  for  this,  but  any  software 
could  be  developed  for  the  purpose,  provided  it  conformed  with  the  standard.  The  core 
system  might  be  written  in  JAVA  or  any  other  language  provided  only  that  an  API  is 
implemented.  Similarly,  entities  may  be  written  in  any  language,  or  several,  provided  that 
they set up calls to the API specification. There are a number of arguments for using JAVA
as  the  basis  for  both  individual  entity  simulation  and  for  building  a  core  system  to  the  HLA 
specification.  These  are  described  below. 

The single most attractive advantage of developing a synthetic battlefield simulation within
a JAVA environment lies in the capabilities of Remote Method Invocation
(RMI), which forms part of the JAVA run-time environment. This is a distributed object model
with some similarities to Microsoft’s Distributed Component Object Model (DCOM)® but with the
advantage that it is effective on any platform that supports a JAVA run-time environment. It
goes well beyond traditional remote procedure calls in being entirely object-based, even allowing
objects  to  be  passed  as  parameters.  Object  behavior  as  well  as  data  can  be  passed  to  a  remote 
object in a seamless and transparent way. A mortar weapon passed as an argument to an
individual infantryman entity, arriving complete with its complement of munitions and its
ability to be fired, gives a picture of this capability. The JAVA run-time environment also
supports  a  naming  and  directory  service  API  (JAVA  JNDI)  that  allows  the  objects  of  RMI  calls 
to  be  found.  (For  more  details  of  this  see  www.javasoft.com/products/jndi/index.html.) 

To  show  how  such  a  service  might  be  used,  suppose  that  a  simulation  of  an  individual 
paratrooper  has  been  developed.  This  simulation  is  a  uniquely  named  JAVA  object  that  can 
be  invoked  on  any  machine  on  the  network  used  for  the  simulation.  The  JAVA  Naming  and 
Directory  Interface  (JNDI)  service  will  inform  a  process  about  which  machines  have  a 
suitable simulation available. To take this an important stage further, we use a class-factory
object  to  produce  the  individual  paratrooper  objects.  This  class  factory  might  use 
randomized  parameters  to  make  each  entity  distinct  but  fitting  a  known  distribution  (like 
Cabbage-Patch  Dolls®).  To  introduce  these  entities  into  the  simulation,  a  process  would  ask 
the  naming  service  for  a  suitable  class-factory  object.  This  might  be  on  one  of  any  number 
of  machines  and  is  therefore  extremely  robust  against  damage  to  the  network.  The  class 
factory can then be asked to produce any number of paratrooper entities, each of which (in
JAVA)  is  capable  of  serializing  itself  to  any  other  machine  on  the  network,  and  running 
there.  Indeed,  the  simulation  can  be  moved  from  machine  to  machine  at  will,  perhaps  in 
response  to  a  condition  such  as  imminent  power  failure. 
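
The class-factory arrangement described above can be sketched in a few lines. The sketch is in Python rather than JAVA, and every name in it (Paratrooper, ParatrooperFactory, the naming_service dictionary, and the parameter distributions) is a hypothetical stand-in chosen for illustration. A dictionary plays the role of the naming service, and JSON text plays the role of JAVA object serialization. Unlike RMI, this sketch moves only an entity's data, not its behavior.

```python
import json
import random
from dataclasses import asdict, dataclass


@dataclass
class Paratrooper:
    """One simulated entity; the fields are illustrative assumptions."""
    name: str
    marching_speed_kmh: float
    reaction_time_s: float


class ParatrooperFactory:
    """Class factory: each entity is distinct but drawn from a known distribution."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._count = 0

    def create(self):
        self._count += 1
        return Paratrooper(
            name=f"para-{self._count}",
            marching_speed_kmh=self._rng.gauss(5.0, 0.5),
            reaction_time_s=max(0.1, self._rng.gauss(0.4, 0.05)),
        )


# Stand-in for the naming service: maps a service name to the factories on offer.
naming_service = {}


def register(name, factory):
    naming_service.setdefault(name, []).append(factory)


def lookup(name):
    """Return the first registered factory; the others remain as fall-backs."""
    return naming_service[name][0]


# Two factories registered under one name, as if hosted on two machines.
register("paratrooper-factory", ParatrooperFactory(seed=1))
register("paratrooper-factory", ParatrooperFactory(seed=2))

squad = [lookup("paratrooper-factory").create() for _ in range(3)]

# "Moving" an entity: serialize its state to text, then rebuild it elsewhere.
wire_text = json.dumps(asdict(squad[0]))
migrated = Paratrooper(**json.loads(wire_text))
```

Because more than one factory can be registered under the same name, a process that loses contact with one host could fall back on another, which is the robustness property the text describes.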

This  approach  would  also  support  testing  new  platforms.  A  manufacturer  might  develop 
an  improved  simulation  of  a  Tornado  fighter-bomber.  They  then  could  introduce  a  new 
machine  with  a  suitably  registered  class-factory  object.  Once  this  was  connected  to  the 
network,  the  new  simulation  would  be  immediately  available  even  if  this  were  done  while  a 
simulation  was  running.  No  relinking,  recompilation,  or  even  pause  in  the  simulation  would 
be needed. The objects could be defined in conformity with the HLA standard.

JAVA also supports secure communications and has well-developed APIs for database
connectivity and for driving graphics devices. An attractive user interface is very much
easier to develop using the JAVA Foundation Classes (JFC) than, for example, using
X-Motif. In addition, if a Just In Time (JIT) compiler is available to the run-time environment, programs
developed in JAVA show little performance degradation in comparison with C++.

A synthetic environment could be developed using facilities offered by the JAVA run-time
environment and existing APIs that would come much closer than existing simulations
in meeting the design goals of maintainability, versatility, and robustness. This approach
would have to be agreed upon by multiple communities and would require a large amount of
resources to be applied uniformly.

6.3  Projects  Improving  Usability 

The projects presented here roughly address the issues raised in Chapter 4. This section
reviews several possible projects for making model building more routine. A more routine
model-building process is useful for practical reasons and important for theoretical ones.
If the models cannot be created within a time commensurate with
gathering  data,  the  majority  of  the  work  will  continue  to  be  data  gathering  because  theory 
development  will  be  seen  as  too  difficult. 

6.3.1  Defining  the  Modeling  Methodology 

There  is  not  yet  a  definitive  approach  or  handbook  for  building  models  that  can  also  be 
used  for  teaching  and  practicing  modeling  cognitive  behavior.  Newell  and  Simon’s  (1972) 
book  is  too  long  and  mostly  teaches  by  example.  Ericsson  and  Simon’s  (1993)  book  on 
verbal  protocol  analysis  has  comments  on  how  to  create  models;  although  useful,  the 
comments are short. Van Someren, Barnard, and Sandberg (1994) provide a useful text,
although  it  is  slightly  short  and  some  of  the  details  of  going  from  model  to  data  are  not 
specified  (if  indeed  they  can  be).  Baxter’s  (1997)  report  and  Yost  and  Newell’s  (1989) 
article  are  useful  examples  of  the  process,  but  both  are  tied  to  a  single  architecture  and  not 
widely  available.  There  are  other  useful  papers  worth  noting,  but  they  are  short  and  not 
comprehensive  (e.g.,  Kieras,  1985;  Ritter  &  Larkin,  1994;  Sun  &  Ling,  1998). 

Rouse  (1980)  has  also  made  an  attempt  at  describing  the  modeling  process.  He 
identifies  the  following  steps  as  forming  an  important  part  of  the  modeling  process: 
(1)  definition,  (2)  representation,  (3)  calculation,  (4)  experimentation,  (5)  comparison,  and 
(6)  iteration.  Rouse  mainly  focuses  on  the  representation  and  calculation  aspects  of 
modeling,  particularly  from  an  engineering  point  of  view.  He  describes  several 
methodologies,  including  control  theory,  queuing  theory,  and  rule-based  production 
systems.  He  also  provides  a  short  tutorial  on  several  of  these  modeling  methods  together 
with  practical  examples  of  systems  engineering  models.  The  examples  are  taken  from  a 
wide  variety  of  domains  including  aviation,  air  traffic  control,  and  industrial  process  control. 
It  is  not  a  complete  treatise  on  human  behavior,  but  does  provide  suggestions  for  methods 
that  may  be  useful  in  modeling  certain  aspects  of  human  behavior. 

Similar  tutorials  and  methodological  summaries  should  be  created  until  they  converge. 
The  results  will  be  useful  to  practitioners  and  those  learning  to  model;  the  latter  will  be  an 
important audience as this field grows. The output is most likely to require a textbook. A
year  to  several  years  of  support  would  significantly  help  create  this  set  of  learning  materials. 

6.3.2  Individual  Data  Modeling:  An  Approach  for  Validating  Models 

What  is  the  best  way  to  make  theoretical  progress  in  the  study  of  behavior?  Is  it  to 
develop  micro-theories  that  explain  a  small  domain  or  to  aim  at  a  higher  goal,  and  develop 
an overarching theory covering a large number of domains, that is, a unified theory? Modern
psychology, as a field, has tended to prefer micro-theories. Unified theories have regularly
appeared in psychology, such as Piaget’s (1954) or Skinner’s (1957) theories, but it is
generally  admitted  that  such  unified  theories  have  failed  to  offer  a  rigorous  and  testable 
picture  of  the  human  mind.  Given  this  relatively  unsuccessful  history,  it  was  with  interest 
that  cognitive  science  observed  Newell’s  (1990;  see  also  Newell,  1973)  call  for  a  revival  of 
unified  theories  in  psychology. 

One  of  the  reasons  for  the  limited  success  of  Newell’s  own  brand  of  UTC  is  that  the 
methodology  commonly  used  in  psychology,  based  on  controlling  potentially  confounding 
variables  by  using  group  data,  is  not  the  best  way  forward  for  developing  UTCs.  Instead, 
Gobet  and  Ritter  (2000)  propose  an  approach,  which  they  call  Individual  Data  Modeling 
(IDM),  where  (1)  the  problems  related  to  group  averages  are  alleviated  by  analyzing 
subjects  individually  on  a  large  set  of  tasks,  (2)  there  is  a  close  interaction  between  theory 
building and experimentation, and (3) computer technology is used to routinely test versions
of  the  theory  on  a  wide  range  of  data.  They  claim  that  there  are  significant  advantages  here, 
that  this  approach  will  also  help  traditional  approaches  progress,  and  that  the  main  potential 
disadvantage, lack of generality, may be taken care of by adequate testing procedures.

IDM  offers  several  particular  advantages  in  this  area.  It  does  not  require  as  much  data 
because  the  data  will  not  be  averaged  but  compared  on  a  fine-grained  level.  Not  requiring  a 
large  amount  of  data  is  attractive  when  the  data  are  detailed  or  expensive  to  acquire,  or 
where  the  model  makes  detailed  predictions.  The  other  advantage  is  that  it  provides  a  model 
that  produces  more  accurate  behavior  on  a  detailed  level.  It  is  this  detailed  level  of  behavior 
that  will  be  necessary  to  not  only  allow  a  model  to  appear  human  in  a  Turing  test,  but  also 
lead  to  accurate  training  results  because  it  performs  like  a  comparable  colleague  or  foe. 
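
The problem with group averages that IDM is meant to alleviate can be illustrated with a small worked example. The response times and the constant-time model below are invented for illustration and are not data from Gobet and Ritter (2000). The sketch compares one model prediction against the group mean and against each subject separately, using root-mean-square error as an assumed measure of fit.

```python
from statistics import mean


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    return mean((p - o) ** 2 for p, o in zip(predicted, observed)) ** 0.5


# Hypothetical response times (seconds) on four trials of the same task.
observed = {
    "subject_a": [4.0, 2.0, 4.0, 2.0],  # alternates between two strategies
    "subject_b": [2.0, 4.0, 2.0, 4.0],  # alternates in the opposite phase
}

# A hypothetical model that predicts the same time on every trial.
model_prediction = [3.0, 3.0, 3.0, 3.0]

# Fit to the group: average across subjects first, then compare.
group_mean = [mean(trial) for trial in zip(*observed.values())]
group_fit = rmse(model_prediction, group_mean)

# Fit to individuals: compare the model with each subject's own data.
individual_fits = {
    subject: rmse(model_prediction, times) for subject, times in observed.items()
}
```

In this contrived case the model matches the group mean exactly (an error of 0.0) while missing every individual trial by a full second, so the averaged fit says nothing about how either subject actually behaved.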

Work using IDM is ongoing at the University of Nottingham and at Pennsylvania State
University.  A  full  test  would  require  one  to  two  years  of  work  to  gather  data  and  compare  it 
with  a  model.  Developing  the  IDM  methodology  and  applying  it  could  be  combined  with 
other  projects,  however,  because  it  is  a  methodology  and  not  a  feature  of  behavior  to  include 
in  a  model. 


6.3.3  Using  Genetic  Algorithms  to  Fit  Data 

There  are  two  potential  uses  of  genetic  algorithms  worth  highlighting.  The  first  is  for 
generating  behavior  as  described  above  in  Section  5.2.1.  The  second  is  for  optimizing 
model-fits  by  adjusting  their  parameters  (Ritter,  1991).  Most  model-fits  have  been 
optimized  by  hand,  which  leads  to  absolute  and  relative  performance  problems.  In  absolute 
terms,  researchers  may  not  be  getting  optimal  performance  from  their  models.  In  relative 
terms,  comparisons  of  hand-optimized  models  may  not  be  fair.  (Sometimes  even  one  model 
is optimized and the other not.) In the case of models with multiple parameters (with
submodels  to  include),  this  job  is  not  tractable  by  hand. 

The  results  obtained  by  optimizing  models  with  genetic  algorithms  suggest  that 
optimizations  done  by  hand  are  likely  to  be  inferior  to  those  done  by  genetic  algorithms 
(Ritter,  1991)  or  by  other  machine-learning  techniques  (Butler,  2000).  Use  of  genetic 
algorithms  (or  similar  techniques)  would  improve  performance  in  absolute  terms,  provide 
fairer  comparisons  between  models,  and  encourage  the  inclusion  of  parameter  set  behavior 
in  model  comparisons.  Several  years  of  a  PhD  student  working  within  a  project  with  a 
model  to  optimize  is  probably  a  good  way  to  progress  work  in  this  area. 

This  optimization  should  initially  be  done  with  an  existing  model  so  that  the  developers 
of  the  interface  have  a  ready-made  model  and  audience.  The  basic  approach  is  simple  and 
robust,  and  should  be  straightforward  to  demonstrate.  Making  the  optimization  routine  and 
portable  are  separate  and  more  advanced  steps,  so  this  project  could  take  almost  any  amount 
of  resources,  ranging  from  a  month  to  several  years. 
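
A minimal sketch of fitting model parameters with a genetic algorithm follows. It is not the procedure used by Ritter (1991). The model (a two-parameter power law of practice), the data, the population size, the mutation widths, and the function names are all assumptions made for illustration. The algorithm keeps the best quarter of each generation, recombines pairs of survivors, and perturbs the offspring.

```python
import random


def power_law(trial, a, b):
    """Predicted time on a given trial under the power law of practice."""
    return a * trial ** (-b)


def error(params, data):
    """Sum of squared differences between model predictions and observations."""
    a, b = params
    return sum((power_law(n, a, b) - t) ** 2 for n, t in data)


def fit_with_ga(data, generations=200, pop_size=40, seed=0):
    """Return the best (a, b) found and the best error seen in each generation."""
    rng = random.Random(seed)
    population = [
        (rng.uniform(0.0, 20.0), rng.uniform(0.0, 2.0)) for _ in range(pop_size)
    ]
    history = []
    for _ in range(generations):
        population.sort(key=lambda p: error(p, data))
        history.append(error(population[0], data))
        parents = population[: pop_size // 4]  # selection: keep the best quarter
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Crossover takes one parameter from each parent; mutation adds noise.
            child = (
                mother[0] + rng.gauss(0.0, 0.2),
                father[1] + rng.gauss(0.0, 0.02),
            )
            children.append(child)
        population = parents + children
    best = min(population, key=lambda p: error(p, data))
    return best, history


# Hypothetical noise-free observations generated from a = 10, b = 0.5.
data = [(n, power_law(n, 10.0, 0.5)) for n in range(1, 21)]
best, history = fit_with_ga(data)
```

Because the surviving parents are carried into the next generation unchanged, the best error never gets worse from one generation to the next. The same loop could be pointed at any model whose fit can be scored by a single number, which is what makes the approach attractive for comparing models fairly.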

6.3.4  Environments  for  Model  Building  and  Reuse 

There  remains  a  need  for  better  environments  for  creating  models.  Few  modeling 
interfaces  provide  much  support  for  the  user  to  program  at  the  problem-space  level  or  even 
the  knowledge  level,  although  the  COGENT  interface  is  interesting  as  an  example 
of  usability. 

Soar,  in  particular,  needs  a  better  interface.  While  there  is  now  a  modest  interface,  even 
the  latest  versions  of  the  Soar  interface  (Kalus  &  Hirst,  1999;  Laird,  1999;  Ritter  et  al., 
1998b)  are  not  as  advanced  as  many  expert  system  shells  and  are  just  becoming  as 
comprehensive as the previous, Lisp-based version (Ritter & Larkin, 1994). The Soar
interface  is,  however,  providing  increasing  amounts  of  support  at  the  symbol  level  (Jones, 
Bauman,  &  Laird,  2001;  Roytam,  2001)  and  higher,  including  model-specific  displays 
(Jones, 1999b). TAQL (Yost & Newell, 1989) and Able (Ritter et al., 1998b), for example, have been
moderately successful but modest attempts to create high-level tools in Soar.
Gratch’s (1998) planning-level interface should be expanded and disseminated as a
modeling  interface.  Knowledge  acquisition  tools  and  techniques  (e.g.,  Cottam  &  Shadbolt, 
1998;  O’Hara  &  Shadbolt,  1998)  might  be  particularly  useful  bases  upon  which  to  build. 

Associated  with  this  project  would  be  general  support  for  programming.  This  includes 
lists  of  frequently  asked  questions,  tutorials,  and  generating  models  or  model  libraries 
designed  for  reuse.  These  libraries  should  either  exist  in  each  architecture  or  in  the  general 
task  language  developed  in  the  previous  task.  These  would  serve  as  a  type  of  default 
knowledge for use in other applications. We can already envision libraries of interaction
knowledge  (about  how  to  push  buttons  and  search  menus),  arithmetic,  and  simple 
optimization  like  the  default  knowledge  in  Soar. 

Work  on  improving  the  modeling  interfaces  for  each  architecture  should  be  incorporated 
as  part  of  another  modeling  project  so  that  the  developers  of  the  interface  have  a  ready-made 
audience.  There  are  multiple  additions  that  would  be  useful  and  multiple  approaches  that 
could  be  explored,  so  this  project  could  take  almost  any  amount  of  resources,  ranging  from  a 
month  to  several  years. 




Human Systems IAC SOAR, 2003


Chapter  6.  Review  of  Recent  Developments  and  Objectives:  Specific  Projects 

6.3.5  Automatic  Model  Building 

Most process models induced from protocols are created by hand. There has been some
work to do this automatically or semi-automatically with machine-learning techniques.
Semi-automatic  model  generation  is  done  in  the  event-structure  modeling  domain  (a 
sociological  level  of  social  events)  by  a  program  called  Ethno  (Heise,  1989;  Heise  &  Lewis, 
1991). Ethno iterates through a database of known events finding those without known
precursors. It presents these to the modeler, querying for their precursors. As it runs it asks
the modeler to create simple qualitative, non-variablized token-matching rules representing
the  event’s  causal  relationships  based  on  social  and  scientific  processes.  The  result  at  the 
end  of  an  analysis  is  a  set  of  10  to  20  rules  that  shape  sociological  behavior  in  that  area.  In  a 
sense,  the  modeler  is  doing  impasse-driven  programming  (i.e.,  what  is  the  next  precursor  for 
an  uncovered  event  not  provided  by  an  already  existing  rule?).  After  this  step,  or  in  place  of 
it,  the  modeler  can  compare  the  model’s  predictions  with  a  series  of  actions  on  a 
sociological  level  (a  protocol  in  the  formal  sense  of  the  word).  The  tool  notes  which  actions 
could  follow  and  queries  the  modeler  based  on  these.  Where  mismatches  occur,  Ethno  can 
present  several  possible  fixes  for  configuration.  Incorporating  the  model  with  the  analysis 
tool  in  an  integrated  environment  makes  it  more  powerful.  It  would  be  a  short  extension  to 
see  the  social  events  as  cognitive  events  in  a  protocol. 
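
The querying loop described above can be sketched as follows. This is not Ethno's actual algorithm or interface. The event names, the representation of a rule as a (precursor, consequence) pair, and the dictionary of canned modeler answers are all hypothetical, and serve only to show the idea of finding events that no existing rule explains and asking for their precursors.

```python
def events_without_precursors(events, rules):
    """Return the events that no rule yet explains as a consequence."""
    explained = {consequence for _precursor, consequence in rules}
    return [event for event in events if event not in explained]


# A hypothetical database of known events and one rule already entered.
events = ["order issued", "unit moves", "contact reported"]
rules = [("order issued", "unit moves")]

# Answers a modeler might give when queried; None marks an initial event
# that has no precursor within the analysis.
modeler_answers = {"order issued": None, "contact reported": "unit moves"}

# Query the modeler about each uncovered event and record the new rules.
for event in events_without_precursors(events, rules):
    precursor = modeler_answers[event]
    if precursor is not None:
        rules.append((precursor, event))

still_uncovered = events_without_precursors(events, rules)
```

After the loop, only the initial event remains without a precursor, which is the point at which an analysis of this kind would stop asking questions.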

Stronger  methods  for  building  models  from  a  protocol  are  also  available.  Cirrus 
(VanLehn  &  Garlick,  1987)  and  ACM  (Langley  &  Ohlsson,  1984)  will  induce  decision  trees 
for  transitions  between  states  that  could  be  turned  into  production  rules  given  a  description 
of  the  problem  space,  including  its  elements  and  the  coded  actions  in  the  protocol.  Cirrus 
and  ACM  use  a  variant  of  the  ID3  learning  algorithm  (Quinlan,  1983).  (ID3  induces  rules 
that  describe  relationships  in  data.) 
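
The core step of ID3 is choosing the attribute whose values best separate the outcomes, measured by information gain. The sketch below shows that step only, on an invented set of coded protocol segments. It is not the variant used in Cirrus or ACM, and the attribute names (top_smaller, last_column) and operator labels are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in the entropy of `target` from splitting `rows` on `attribute`."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def best_attribute(rows, attributes, target):
    """Pick the attribute to test at the root of the decision tree."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical coded protocol: features of the state and the operator applied.
protocol = [
    {"top_smaller": True, "last_column": False, "operator": "borrow"},
    {"top_smaller": True, "last_column": True, "operator": "borrow"},
    {"top_smaller": False, "last_column": False, "operator": "subtract"},
    {"top_smaller": False, "last_column": True, "operator": "subtract"},
]

root = best_attribute(protocol, ["top_smaller", "last_column"], "operator")
```

Here the split on top_smaller separates the two operators perfectly, so it would become the condition of the induced rule, while last_column carries no information about which operator was applied.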

These  tools  look  like  a  useful  way  to  refine  process  models.  Why  is  automatic  creation 
of  process  models  not  done  more  often?  Perhaps  it  is  because  these  tools  do  not  create 
complete  process  models.  They  take  a  generalized  version  of  an  operator  that  must  be 
specified  as  part  of  a  process  model.  It  could  also  be  that  finding  the  conditions  of  operators 
is  not  the  difficult  problem  but  that  creating  the  initial  process  model  and  operators  is.  It 
could  also  be  that  it  is  harder  to  write  process  models  that  can  be  used  by  these  machine 
learning  algorithms.  In  any  case,  these  methods  should  be  explored  further. 

Diligent  (Angros,  1998),  Instructo-Soar  (Huffman  &  Laird,  1995),  and  Observo-Soar 
(van  Lent,  1999)  are  approaches  to  create  models  in  Soar  that  learn  how  to  perform  new 
tasks  by  observing  behavior  and  inferring  problem-solving  steps  to  duplicate  them.  Related 
models  have  been  used  in  synthetic  environments  (Assanie  &  Laird,  1999;  van  Lent  & 
Laird,  1999).  They  have  had  limited  use  but  suggest  that  learning  through  observation  may 
be  a  way  to  create  models  as  it  is  an  important  way  that  humans  learn.  Their  lack  of  use 
could  simply  be  due  to  the  fact  that  they  are  novel  software  systems.  As  novel  systems  they 
are  probably  difficult  for  people  other  than  their  developers  to  use  and  will  have  to  go 
through  several  iterations  of  improvement  (like  most  pieces  of  software)  before  they  are 
ready  for  outsiders.  With  a  small  user  base  (so  far),  the  need  has  not  forced  software 
development,  which  has  further  decreased  their  potential  audience. 

Automatic modeling tools need to be developed. Machine-learning algorithms and
theories  of  cognition  are  developed  enough  that  this  could  be  a  very  fruitful  approach.  A 
several-year  effort  here  could  yield  large  benefits  of  more  routine  modeling. 

6.3.6  Improvements  to  ModSAF 

A  major  problem  with  ModSAF  is  usability.  ModSAF  is  large  and  has  a  complicated 
syntax.  Users  report  problems  learning  and  using  it.  One  way  to  improve  its  usability  might 
be  a  better  interface;  better  manuals  and  training  aids  might  also  be  useful. 

The  approach  used  by  models  of  behavior  to  interact  with  basic  simulation  capabilities 
such  as  ModSAF  needs  to  be  regularized.  A  fundamentally  better  approach  might  be 
possible.  There  exists  an  interface  between  ModSAF  and  Soar  that  partly  provides  a  model 
eye and hand. This eye/hand could be improved to provide a more abstract interface to
ModSAF,  one  that  might  be  easier  to  use  (Schwamb  et  al.,  1994). 

One  thing  we  have  repeatedly  noted  is  that  getting  models  to  interact  with  simulations  is 
more  bearable  when  both  are  implemented  within  the  same  development  environment. 
When  they  are  not,  work  proceeds  much  more  slowly  (Ritter  et  al.,  2000;  Ritter  &  Major, 
1995),  requiring  a  mastery  of  both  environments.  The  situation  is  exacerbated  because  the 
development  and  use  of  any  communication  facility  tends  to  be  an  ill-defined  problem  with 
numerous  wild  subproblems  (i.e.,  problems  where  the  time  to  solution  can  be  high  and  with 
a large variance, that is, not easily predicted). So, for example, although the ModSAF
Tac-Air system (Tambe, Johnson, Jones, Koss, Laird, Rosenbloom, et al., 1995) appears as if it
was developed using joint compilation techniques, it was probably difficult to use because it
implements communication between ModSAF and the Tac-Air model using sockets.
Although informal communication with researchers in the Soar and robotics communities
suggests that the use of sockets may be becoming more routine, this has not always
been  the  case. 
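
To make the socket-based arrangement concrete, the following sketch passes one percept and one action between a simulation and a model. It does not reproduce the ModSAF or Tac-Air interfaces. The newline-delimited JSON message format and the message fields are assumptions made for illustration, and the two endpoints run in a single process only so that the example is self-contained.

```python
import json
import socket


def send_message(sock, message):
    """Send one JSON message terminated by a newline."""
    sock.sendall((json.dumps(message) + "\n").encode("utf-8"))


def receive_message(sock):
    """Read until a newline, then decode one JSON message.

    This assumes the peers strictly take turns, so at most one message
    is ever waiting to be read.
    """
    buffer = b""
    while not buffer.endswith(b"\n"):
        chunk = sock.recv(1024)
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buffer += chunk
    return json.loads(buffer.decode("utf-8"))


# socketpair() gives two connected endpoints in one process, standing in for
# a simulation and a model that would normally run as separate programs.
simulation_end, model_end = socket.socketpair()

# "Eye": the simulation reports what the model can currently see.
send_message(simulation_end, {"type": "percept", "object": "vehicle", "range_m": 800})
percept = receive_message(model_end)

# "Hand": the model replies with an action for the simulation to carry out.
send_message(model_end, {"type": "action", "name": "report-contact"})
action = receive_message(simulation_end)

simulation_end.close()
model_end.close()
```

Even in this small form, the message framing, turn-taking, and error handling all have to be agreed between the two sides, which hints at why such communication facilities tend to generate the unpredictable subproblems noted above.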


6.4  Other  Applications  of  Behavioral  Models  in  Synthetic  Environments 

There are numerous ways that behavioral models could be applied outside the military
domain. We will examine four of them here.

The  most  obvious  additional  application  of  the  models  arising  from  approaches 
proposed  in  this  report  is  in  the  provision  of  automated  support  for  system  operators.  This 
support can take the form of intelligent decision-support systems or embedded assistants
that guide operator behavior. There are some existing applications, most notably the Pilot’s
Associate (Geddes, 1989), its derivative, Hazard Monitor (Greenberg, Small, Zenyah, &
Skidmore, 1995), and CASSY (Wittig & Onken, 1992), all from the aviation domain. In the
United  Kingdom,  the  Future  Organic  Airborne  early  warning  system  is  attempting  to  insert  a 
knowledge-based  system  into  the  Osprey  aircraft  and  radar  simulation  to  assist  users. 

These  assistants,  because  they  have  a  model  of  what  the  user  is  likely  to  do  next,  should 
be  able  to  assist  the  user:  if  not  by  performing  the  task,  then  by  preparing  materials  or 
information,  or  by  modifying  the  display  to  help  distinguish  between  alternatives  or  make 
performing  actions  easier.  In  the  past,  such  assistants  have  had  only  a  limited  ability  to 
model users. With increased validity and accuracy, these models may become truly useful.

The second application is in education and training. The uses in education have been
fairly  well  illustrated  by  Anderson’s  work  with  cognitive  model-based  tutors  (Anderson, 
Corbett,  Koedinger,  &  Pelletier,  1995).  In  training,  behavioral  models  can  be  used  to 
provide  experts  to  emulate  and  the  same  knowledge  can  also  be  used  to  debrief  students’ 
performances  (Ritter  &  Feurzeig,  1988).  The  knowledge  can  also  be  used  to  populate 
adversaries and colleagues in the same environment (Bloedorn & Downes-Martin, 1985).

Training needs exist outside the military in several domains where dynamic models are
necessary.  Mining,  for  example,  is  starting  to  use  virtual  reality  to  train  simple  tasks 
(Hollands,  Denby,  &  Brooks,  1999).  Virtual  reality  is  already  being  used  to  train  hazard- 
spotting,  avoiding  mine  machinery  as  a  pedestrian,  and  driving  vehicles  underground 
(Schofield & Denby, 1995). A web search on virtual reality and training will indicate a wide
range  of  other  areas  of  application  as  well. 

The third application is in entertainment, which has been proposed for some time.
A recent report by the U.S. National Research Council (Computer Science and
Telecommunications Board, 1997) suggests that it is possible to use synthetic environments
and  the  behavioral  models  in  them  for  entertainment.  This  is  currently  being  done  by  the 
Institute  for  Creative  Technologies  at  the  University  of  Southern  California. 

The  fourth  application  is  in  systems  analysis.  The  behavioral  models  can  be  used  to 
examine  different  system  designs  to  measure  errors,  processing  rates,  or  emergent  strategies. 
To return to mining again, truck models in a simulation can be used to examine road layouts
in  mines  (Williams,  Schofield,  &  Denby,  1998). 

6.5  Summary  of  Projects 

We have laid out important objectives for models of behavior in synthetic environments
in the areas of providing more complete performance, increased integration of the
models  with  each  other  and  with  synthetic  environments,  and  improved  usability  of  the 
models.  A  wide  range  of  funding  bodies  may  be  interested  in  supporting  these  projects 
because  most  of  these  projects  have  both  engineering  and  scientific  results.  They  will  not 
only  improve  engineering  models  of  human  behavior,  but  they  will  also  improve  our 
understanding of behavior and our general scientific ability to predict and model human
behavior.

These  proposals,  taken  as  a  whole,  call  for  several  broad  and  general  research  programs. 
They  suggest  several  moderating  variables  that  affect  cognition,  including  emotions  and 
behavioral  moderators,  personality,  and  interactions  with  the  environment,  which  should  be 
included  in  cognitive  architectures.  They  argue  for  creating  or  moving  towards  a  more 
uniform  format  for  data  and  models  and  a  more  clearly  defined  approach  for  modeling. 
There  are  also  several  concrete  suggestions  for  making  modeling  easier  and  more  routine, 
including  providing  more  usable  modeling  environments  and  supporting  automatic  model 
generation.  Finally,  we  were  able  to  suggest  some  further  applications  of  models  of  behavior 
in  synthetic  environments. 

Description of Soar and ACT-R

Soar  and  ACT-R  are  two  of  the  most  commonly  used  cognitive  architectures.  They  can 
be  seen  as  theories  of  cognition  realized  as  sets  of  principles  and  constraints  on  cognitive 
processing,  a  cognitive  architecture  (Newell,  1990).  They  both  provide  a  conceptual 
framework  for  creating  models  of  how  people  perform  tasks.  They  are  thus  similar  to  other 
unified theories in psychology, such as PSI and COGENT.

Both  Soar  and  ACT-R  are  supported  by  a  computer  program  that  realizes  those  theories 
of  cognition.  There  are  debates  as  to  whether  and  how  the  theory  is  different  from  the 
computer  program,  but  it  is  fair  to  say  that  they  are  at  least  highly  related.  It  is  generally 
acknowledged that the program implements the theory and there are commitments in the
program that must be made to create a running system that are not in the theory, that is, places
where the current theory does not say one thing or another.

As  cognitive  architectures,  their  designers  intend  them  to  model  the  full  breadth  and 
width  of  human  behavior.  Such  cognitive  architectures,  including  the  ones  discussed  in  this 
report,  do  so  to  a  greater  or  lesser  extent,  usually  with  the  areas  covered  increasing 
monotonically  over  time.  This  approach  to  modeling  human  cognition  is  explained  in  books 
by  Newell  (1990)  and  Anderson  (Anderson,  1993;  Anderson  &  Lebiere,  1998).  These  books 
also  provide  introductions  of  Soar  and  ACT-R. 

Further information on both Soar and ACT-R is available from the references cited
here,  as  well  as  the  sources  included  in  the  bibliography  at  the  end  of  this  appendix.  The 
sources  in  the  bibliography  were  used  to  write  this  appendix,  particularly  Johnson  (1997), 
Jones  (1996a,  1996b),  and  Ritter  (2001). 

B.1  Background  of  Soar  and  ACT-R 

Soar  and  ACT-R  are  each  based  on  a  set  of  different  theoretical  assumptions,  reflecting, 
largely,  their  different  conceptual  origins.  Soar  was  developed  by  combining  three  main 
elements:  (1)  the  heuristic  search  approach  of  knowledge-lean  and  difficult  tasks,  (2)  the 
procedural  view  of  routine  problem  solving,  and  (3)  a  symbolic  theory  of  bottom-up 
learning  designed  to  produce  the  power  law  of  learning  (Laird,  Rosenbloom,  &  Newell, 
1986).  However,  many  of  the  constraints  on  Soar’s  theoretical  assumptions  consist  of 
general  characteristics  of  intelligent  agents,  rather  than  detailed  behavioral  phenomena. 
Soar’s outlook is more biased towards performance because it arose out of an AI-based
tradition. 

In  contrast,  ACT-R  grew  out  of  detailed  phenomena  from  memory,  learning,  and 
problem solving (Anderson, 1983, 1990; Singley & Anderson, 1989). ACT-R is thus better suited
to predicting slightly lower-level phenomena, and is slightly better suited to
predicting reaction times accurately, particularly for tasks under 10 seconds in
duration. These differences are relative; both architectures have been used for both high-
and low-level models, with attention paid to both performance and time predictions.
ACT-R’s outlook is more biased towards predicting reaction-time means and distributions
because  it  arose  out  of  a  more  experimental  psychology  tradition. 

B.2  Similarities  Between  Soar  and  ACT-R 

Soar  and  ACT-R  can  be  seen  as  similar  in  numerous  ways.  They  both  have  two  kinds  of 
memory,  declarative  (facts)  and  procedural  (rules),  although  they  represent  these  items 
differently.  Typical  instantiations  of  them  now  have  input  provided  through  a  model  of 
perception  and  output  buffered  through  a  model  of  motor  behavior  (Byrne,  2001;  Chong, 
2001;  Ritter  et  al.,  2000). 

Both  Soar  and  ACT-R  model  behavior  by  reducing  much  of  human  behavior  to 
problem  solving.  Soar  does  this  rather  explicitly,  being  based  upon  Newell’s  information 
processing  theory  of  problem  solving  (Newell,  1968),  whereas  ACT-R  merely  implies  it  by 
being goal-directed.

In both architectures these memories are conceptually infinite. ACT-R makes no provision
for the removal of any memory item, whereas the Soar architecture does remove items from
declarative memory, which can therefore be seen as a type of short-term memory.
Declarative memory can be manipulated by adding new items or changing existing ones. For
procedural memory, rules may only be added in both architectures.

The  course  of  processing  involves  moving  from  an  initial  state  to  a  specified  goal 
state.  ACT-R  has  only  one  possible  goal  state  (Version  5),  whereas  Soar  may  have 
several  of  them  arranged  in  a  stack.  Movement  between  the  initial  and  goal  states  usually 
involves  the  creation  of  sub-goals  to  accomplish  the  various  parts  leading  up  to  the 
satisfaction  of  the  goal. 

Both  ACT-R  and  Soar  maintain  a  goal  hierarchy  where  each  subsequent  sub-goal 
becomes  the  focus  of  the  system.  In  ACT-R,  these  must  be  satisfied  in  a  serial  manner  and 
in  the  reverse  of  the  order  they  appear  in  the  hierarchy  (which  is  not  directly  visible  to  both 
the  model  and  the  modeler).  Soar  generally  proceeds  in  a  serial  way  as  well,  but  is  capable 
of  removing  (or  solving)  intermediate  sub-goals  should  the  current  problem  solving  resolve 
a  sub-goal  that  is  much  higher  in  the  goal  hierarchy.  This  difference  makes  ACT-R 
potentially  less  reactive,  although  work  is  in  progress  to  make  ACT-R  more  reactive 
(Lebiere,  2001). 
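
The difference in goal handling described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not either architecture's implementation: the function names `pop_serial` and `pop_to_resolved` and the goal labels are hypothetical, and the goal hierarchy is reduced to a simple list with the top-level goal first.

```python
def pop_serial(stack):
    """ACT-R-style handling: only the most recent sub-goal can be satisfied next."""
    return stack[:-1]


def pop_to_resolved(stack, resolved):
    """Soar-style handling: when a goal higher in the hierarchy is resolved,
    that goal and every sub-goal created beneath it are removed together."""
    if resolved not in stack:
        return list(stack)
    return stack[:stack.index(resolved)]


# Hypothetical goal hierarchy, top-level goal first, current focus last.
goals = ["fly-mission", "reach-waypoint", "avoid-threat", "check-radar"]

serial = pop_serial(goals)                           # drops only "check-radar"
reactive = pop_to_resolved(goals, "reach-waypoint")  # drops three goals at once
```

In the serial case the model must still work back through "avoid-threat" before it can return to "reach-waypoint", which is one way to read the remark that ACT-R is potentially less reactive.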

B.3  Differences  Between  Soar  and  ACT-R 

There  are  also  fundamental  differences  between  the  two  architectures.  Soar  only  moves 
between  states  through  changing  the  state  as  part  of  a  decision  procedure,  which  rules  can 
vote  on  but  cannot  directly  cause.  In  Soar,  when  no  more  productions  can  fire,  an  operator  is 
selected  or  a  state  is  modified.  This  whole  process  is  called  a  decision  cycle.  Where  an 
operator cannot be selected (e.g., because the preferences for the set of operators conflict
with each other or are incomplete), a sub-goal is created with a goal to choose the next operator.
Movement  between  states  is  done  in  ACT-R  by  firing  productions,  which  may  change  the 
state  and  goal  stack  directly. 
Soar allows multiple rules to fire in parallel. This may lead to impasses because the
knowledge  in  the  rules  may  suggest  different  operators,  but  problem  solving  is  available  to 
resolve  this.  In  ACT-R,  when  the  conditions  of  several  productions  are  met,  a  conflict 
resolution  mechanism  selects  the  production  that  it  estimates  to  have  the  highest  gain. 
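
The Soar side of this contrast can be illustrated with a minimal sketch. It assumes a strong simplification: Soar's symbolic preferences are replaced here by numeric votes, and the function name `soar_decide`, the rule format, and the example rules are all hypothetical. All matching rules fire, their votes are pooled, and an impasse is reported when no single operator wins.

```python
def soar_decide(state, rules):
    """One simplified decision: every matching rule fires and votes for an operator.

    rules is a list of (condition, operator, preference) triples, where
    condition is a function of the state and preference is a numeric vote
    (a stand-in for Soar's symbolic preferences).
    """
    votes = {}
    for condition, operator, preference in rules:
        if condition(state):  # rules fire in parallel; order does not matter
            votes[operator] = votes.get(operator, 0) + preference
    if not votes:
        return ("impasse", [])          # incomplete knowledge: nothing proposed
    best = max(votes.values())
    winners = sorted(op for op, v in votes.items() if v == best)
    if len(winners) > 1:
        return ("impasse", winners)     # conflict: a sub-goal must choose
    return ("select", winners[0])


# Hypothetical rules for illustration only.
example_rules = [
    (lambda s: s["threat"], "evade", 1),
    (lambda s: s["threat"], "attack", 1),
    (lambda s: s["low_fuel"], "evade", 1),
]

tied = soar_decide({"threat": True, "low_fuel": False}, example_rules)
clear = soar_decide({"threat": True, "low_fuel": True}, example_rules)
```

In the tied case the sketch returns an impasse listing both candidate operators, which corresponds to the point where Soar would create a sub-goal to choose between them.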

Learning  in  Soar  occurs  only  for  production  memory.  New  rules  are  created  by  the 
architecture  whenever  a  sub-goal  is  resolved,  such  that  when  next  encountering  the  same 
situation,  the  new  production  fires  without  the  need  to  enter  a  new  sub-goal.  This  type  of 
information  can  include  which  operator  to  select,  or  how  to  implement  an  operator.  These 
rules  tend  to  be  atomic,  and  in  nearly  all  cases  can  be  seen  as  immediately  fully  learned. 
This learning mechanism (chunking) can implement a wide range of learning effects,
including the learning of long-term declarative information, because such information is
represented solely as the results of procedural memory.
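
The effect of chunking can be sketched as caching the result of a sub-goal as a new rule. This is only an illustration under stated assumptions: the class `ChunkingAgent` and its attributes are hypothetical, and a learned rule here matches an identical situation exactly, whereas chunks in Soar are built from the working-memory elements actually tested in the sub-goal.

```python
class ChunkingAgent:
    """Toy agent that learns one rule each time a sub-goal is resolved."""

    def __init__(self, solve_in_subgoal):
        self.solve_in_subgoal = solve_in_subgoal  # slow, deliberate problem solving
        self.chunks = {}                          # learned rules: situation -> result
        self.subgoals_entered = 0

    def act(self, situation):
        if situation in self.chunks:
            # A learned rule fires directly; no sub-goal is needed.
            return self.chunks[situation]
        # No rule applies: enter a sub-goal and work the answer out.
        self.subgoals_entered += 1
        result = self.solve_in_subgoal(situation)
        # The new rule is stored once and is immediately fully learned.
        self.chunks[situation] = result
        return result


agent = ChunkingAgent(solve_in_subgoal=lambda numbers: sum(numbers))
first = agent.act((1, 2))   # solved in a sub-goal, then chunked
second = agent.act((1, 2))  # answered by the learned rule
```

After the two calls, `agent.subgoals_entered` is 1, reflecting that the second encounter with the same situation did not require a new sub-goal.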

ACT-R  learning  involves  both  declarative  and  procedural  memory.  When  rules  fire  they 
become stronger, and as declarative memories are used more they are strengthened as well.
Each  production  also  has  an  expected  gain  value  based  on  its  probability  of  success  and  its 
cost  and  the  current  goal’s  value.  The  expected  gain  is  used  for  conflict  resolution;  the 
production  with  the  highest  expected  gain  is  selected  when  several  productions  are  possible 
matches. The more often the production meets with later success (e.g., the sub-goal ends up
being  solved),  the  higher  this  probability  for  the  rule  will  become.  This  strength  also 
influences  the  activation  of  the  declarative  memory  items  that  are  matched  by  the  condition 
of  the  production,  and  also  the  rule  execution  time. 
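
Conflict resolution by expected gain can be sketched as follows. The sketch assumes the form E = PG - C that is commonly given for ACT-R 4 and 5, with P estimated as the proportion of past successes; that formula is not stated in the text above, and noise and prior values are omitted. The class `Production`, the function `resolve_conflict`, and the example numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Production:
    name: str
    successes: int   # times the production later led to success
    failures: int    # times it did not
    cost: float      # expected cost, in the same units as the goal value

    def p_success(self):
        # Assumes at least one recorded outcome, to avoid dividing by zero.
        return self.successes / (self.successes + self.failures)

    def expected_gain(self, goal_value):
        # Assumed form: E = P * G - C
        return self.p_success() * goal_value - self.cost


def resolve_conflict(matching, goal_value):
    """Select the matching production with the highest expected gain."""
    return max(matching, key=lambda p: p.expected_gain(goal_value))


# Hypothetical productions: one cheap but fallible, one costly but reliable.
retrieve = Production("retrieve-answer", successes=9, failures=1, cost=1.0)
count = Production("count-up", successes=10, failures=0, cost=4.0)

low_stakes = resolve_conflict([retrieve, count], goal_value=20.0)
high_stakes = resolve_conflict([retrieve, count], goal_value=40.0)
```

With a goal value of 20 the cheaper production wins, and with a goal value of 40 the more reliable one does, which shows how the goal's value trades off against cost and probability of success.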

Each  item  in  declarative  memory  has  an  associated  activation  that  changes  based  upon 
how often it has been used, and how strongly it is associated with other items that are being
used. The more often an item is used, the higher its base-level activation will become. The
more strongly an item is associated with items that are being used, the more likely its
activation is to be raised.
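
The two influences on activation can be sketched numerically. The sketch assumes the base-level learning equation usually given for ACT-R, B = ln(sum of t^-d over past uses, where t is the time since each use), with a decay d of 0.5, plus a weighted sum of associative strengths; neither equation appears in the text above, and the function names and numbers are hypothetical.

```python
import math


def base_level(use_times, now, decay=0.5):
    """Base-level activation from the history of uses (assumed ACT-R form).

    use_times must all be earlier than now, so every elapsed time is positive.
    """
    return math.log(sum((now - t) ** -decay for t in use_times))


def activation(use_times, now, associations, decay=0.5):
    """Base level plus activation spread from items currently in use.

    associations is a list of (source_weight, strength) pairs, one for each
    item in use that is associated with this item.
    """
    spread = sum(weight * strength for weight, strength in associations)
    return base_level(use_times, now, decay) + spread


used_once = base_level([9.0], now=10.0)            # a single recent use
used_often = base_level([0.0, 5.0, 9.0], now=10.0)  # same recent use plus two more
primed = activation([9.0], now=10.0, associations=[(0.5, 2.0)])
```

Here `used_often` is larger than `used_once`, and `primed` exceeds `used_once` by the associative contribution of 1.0, matching the two effects described in the paragraph above.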

A  rule  learning  mechanism  is  less  often  used  in  ACT-R  models,  and  when  it  has  been 
used,  the  resulting  rules  are  typically  created  in  a  nascent  state  such  that  they  have  to  be 
created  several  times  before  they  are  fully  learned. 

B.4  Bibliography  for  Soar  and  ACT-R 

ai.eecs.umich.edu/soar/, the Soar Group’s homepage
act.psy.cmu.edu/,  the  ACT-R  Group’s  homepage 
acs.ist.psu.edu/soar-faq,  Soar  Frequently  Asked  Questions  list 
acs.ist.psu.edu/act-r-faq,  ACT-R  Frequently  Asked  Questions  list 

Jones, G. (1996). The architectures of Soar and ACT-R, and how they model
human behaviour. Artificial Intelligence and Simulation of Behaviour Quarterly, 96
(Winter), 41-44.

Johnson, T. R. (1997). Control in ACT-R and Soar. In M. Shafto & P. Langley (Eds.),
Proceedings of the Nineteenth Annual Conference of the Cognitive Science Society (pp. 343-348).
Hillsdale, NJ: Erlbaum.

Ritter, F. E. (2002). Soar. In Encyclopedia of cognitive science. London: Macmillan.

Glossary of Acronyms and Abbreviations


ABC                 A* search with Bounded Costs
ACT-R               Adaptive Control of Thought - Rational
ACT-R/PM            A perceptual-motor component added to ACT-R
AI                  Artificial Intelligence
AMBR                Agent-Based Modeling and Behavior Representation project
APEX                A tool for applied human performance modeling developed at NASA
API                 Application Programming Interface
ATAL workshops      Architectures, Theories, And Languages Workshop series
BDI architectures   Architectures based on representing Beliefs, Desires, and Intentions
CES                 Cognitive Environment Simulation
CHIRP               Confidential Human Factors Incident Reporting Program
CHREST              Chunk Hierarchy and REtrieval STructures
CMAC                Cerebellar Model Arithmetic Computer
CoCoM               Contextual Control Model
COSIMO              COgnitive SIMulation MOdel
CREAM               Cognitive Reliability and Error Analysis Method
DERA                Defence Evaluation and Research Agency (UK)
DCOM                Distributed COmponent Model
DIS                 Distributed Interactive Simulation (system)
EPAM                Elementary Perceiver and Memoriser
EPIC                A cognitive architecture based on a production rule interpreter
                    that assumes no cognitive limitations on processing and a set of
                    perceptual motor processors that provide a limitation on cognition.
FLAME               Fuzzy Logic Adaptive Model of Emotions
GAs                 Genetic Algorithms
HCI                 Human-Computer Interaction
HLA                 Higher-Level Architecture
IDM                 Individual Data Modeling, modeling based on fitting the behavior of
                    individuals and then aggregating the results, as compared with
                    fitting data aggregated across subjects.
IMPS                Internet-based Multi-agent Problem Solving
JACK                JAVA Agent Compiler and Kernel
JAVA                A procedural language used to support web applications
JFC                 JAVA Foundation Classes
JNDI                JAVA Naming and Directory Interface
KBS                 Knowledge-Based Systems
LTM                 Long-Term Memory
MLP                 Multi-Layer Perceptron
ModSAF              Modular Semi-Automated Forces
NDM                 Naturalistic Decision Making
ONR                 Office of Naval Research
RDM                 Rapid Decision Making
RMI                 Remote Method Invocation
SDM                 Sparse Distributed Memory
SEs                 Synthetic Environments
SMOC                Simplified Model Of Cognition
SRG                 System Response Generator
STM                 Short-Term Memory
UTC                 Unified Theory of Cognition